Figure 8. The HA-IFNα expression cassette in pSAC35. The expression cassette
comprises:

PRB1 promoter, from S. cerevisiae.

Fusion leader: first 19 amino acids of the HA leader followed by the last 6 amino acids of
the MFα-1 leader.

HA-IFNα coding sequence with a double stop codon (TAATAA).

ADH1 terminator, from S. cerevisiae. Modified to remove all the coding sequence
normally present in the HindIII/BamHI fragment generally used.



Figure 8 
Localisation of 'Loops' based on the HA Crystal Structure
which could be used for Mutation/Insertion



1 DAHKSEVAHR FKDLGEENFK ALVLIAFAQY LQQCPFEDHV KLVNEVTEFA 

HHHHH HHH HHH HHHHHHHHHH HHHHH HHHHHHHHHH 

I II III 

51 KTCV ADESAE N CDKSLHTLF GDKLC TVATL RET YGEM AD C CAKQEPERNE
HHHHH " " ~" HHHHH HHHHH -~— HHHH H HHHH 

101 CFLQHKDDNP NLPRLVRPEV DVMCTAFHDN EETFLKKYLY EIARRHPYFY 
HHHH H HHHHHHHH HHHHHHHHH HHHHH 

IV 

151 APELLFFAKR YKAAFTECC Q AADKAA CLLP KLDELRDEGK ASSAKQRLKC
HHHHHHHHHH HHHHHHHHH HHHHH HHHHHHHHHH HHHHHHHHHH

V 

201 ASLQKFGERA FKAWAVARLS QRFPKAEFAE VSKLVTDLTK VHTECCHGDL

HHHHH HH HHHHHHHHHH HH HHH HHHHHHHHHH HHHHHH HH 

VI VII 

251 LECADDRADL AKYI C ENQDS ISSKL KECC E KPL LEKSHCI AEVENDEMPA

HHHHHHHHHH HHHHH HHHHH HHHHHHH H 

301 DLPSLAADFV E S KDVCKNYA EAKDVFLGMF LYEYARRHPD YSVVLLLRLA

HHHH HHHHHH HHHHHHH HHHHHH HHHHHHHH 

VIII 

351 KTYETTLEKC CAAADPHECY AKVFDEFKPL VEEPQNLIKQ NCELFEQLGE

HHHHHHHHHH HH H HHHHH - HHHHHHHHHH HHHHHHH 

IX 

401 YKFQNALLVR YTKKVPQVST PTLVEVSRNL GKVGSKCC KH PEA KRMP CAE

" HHHHHHHHHH HHHH H HHHHHHHHHH HHH HHHHHHHH 

X XI
451 DYLSVVLNQL CVLHEKTPVS DRVTKCCT ES LVNR RPCFSA LEVDETYVPK
HHHHHHHHHH HHHHH HHHHHHHHH HHHHHHHH 

501 EFNAETFTFH ADICTLSEKE RQIKKQTALV ELVKHKPKAT KEQLKAVMDD 

HHH HHH HHHHMIVIEHHH HHH HHHHHHHH 

XII 

551 FAAFVEKCC K ADDKET CFAE EGKKLVAASQ AALGL

HHHHHHHH HHHH HHHHHHHHHH HH 



Loop

I     Val54-Asn61
II    Thr76-Asp89
III   Ala92-Glu100
IV    Gln170-Ala176
V     His247-Glu252
VI    Glu266-Glu277
VII   Glu280-His288
VIII  Ala362-Glu368
IX    Lys439-Pro447
X     Val462-Lys475
XI    Thr478-Pro486
XII   Lys560-Thr566
Examples of Modifications to Loop IV



a. Randomisation of Loop IV. 

IV 

151 APELLFFAKR YKAAFTECC Q AADKAACLLP KLDELRDEGK ASSAKQRLKC 
HHHHHHHHHH HHHHHHHHH HHHHH HHHHHHHHHH HHHHHHHHHH 

IV 

151 APELLFFAKR YKAAFTECC X XXXXXX CLLP KLDELRDEGK ASSAKQRLKC
HHHHHHHHHH HHHHHHHHH HHHHH HHHHHHHHHH HHHHHHHHHH 



X represents the mutation of the natural amino acid to any 
other amino acid. One, more or all of the amino acids can 
be changed in this manner. This figure indicates all the 
residues have been changed. 
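
The randomisation in part (a) can be illustrated with a short Python sketch. It is an illustration only, under stated assumptions: the function name `randomise_loop`, the restriction to the 20 standard amino acids and the seeded random generator are not from the source. The segment (residues 151-200) and the Loop IV boundaries (Gln170-Ala176) are taken from the figure.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: 20 standard residues only

# Residues 151-200 of mature HA as drawn in the figure (1-based numbering).
SEGMENT = "APELLFFAKRYKAAFTECCQAADKAACLLPKLDELRDEGKASSAKQRLKC"
SEGMENT_START = 151
LOOP_IV = (170, 176)  # Gln170-Ala176


def randomise_loop(segment, start, loop, rng, positions=None):
    """Mutate loop residues to any *other* amino acid (the 'X' of the figure).

    positions: residue numbers to mutate; None means every loop residue.
    """
    first, last = loop
    targets = range(first, last + 1) if positions is None else positions
    residues = list(segment)
    for pos in targets:
        if not first <= pos <= last:
            raise ValueError(f"residue {pos} is outside the loop")
        i = pos - start
        # Exclude the natural residue so that X always denotes a change.
        choices = [aa for aa in AMINO_ACIDS if aa != residues[i]]
        residues[i] = rng.choice(choices)
    return "".join(residues)


rng = random.Random(0)  # seeded only so the sketch is repeatable
mutant = randomise_loop(SEGMENT, SEGMENT_START, LOOP_IV, rng)
print(SEGMENT)
print(mutant)
```

Passing `positions=[173]`, for example, would change only Asp173, which corresponds to the "one, more or all" wording above.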

b. Insertion (or replacement) of Randomised sequence into Loop IV. 

(X)n
IV 

151 APELLFFAKR YKAAFTECC Q AADKAA CLLP KLDELRDEGK ASSAKQRLKC 
HHHHHHHHHH HHHHHHHHH HHHHH HHHHHHHHHH HHHHHHHHHH

The insertion can be at any point on the loop and of any
length n, where n would typically be 6, 8, 12, 20 or 25.
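
As a rough sketch of part (b), the following Python fragment inserts a random peptide (X)n into Loop IV, or replaces the loop with it. The function name `insert_random_peptide`, its parameters and the choice of the 20 standard amino acids are assumptions made for illustration; only the segment, the loop boundaries and the typical values of n come from the figure.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: 20 standard residues only

# Residues 151-200 of mature HA as drawn in the figure (1-based numbering).
SEGMENT = "APELLFFAKRYKAAFTECCQAADKAACLLPKLDELRDEGKASSAKQRLKC"
SEGMENT_START = 151
LOOP_IV = (170, 176)  # Gln170-Ala176


def insert_random_peptide(segment, start, loop, n, rng, after=None):
    """Return the segment carrying a random peptide of length n in the loop.

    after: residue number after which the peptide is inserted; it must leave
           loop residues on both sides. If None, the whole loop is replaced.
    """
    first, last = loop
    peptide = "".join(rng.choice(AMINO_ACIDS) for _ in range(n))
    if after is None:
        # Replacement: drop the natural loop residues entirely.
        return segment[:first - start] + peptide + segment[last - start + 1:]
    if not first <= after < last:
        raise ValueError("insertion point must lie inside the loop")
    cut = after - start + 1
    return segment[:cut] + peptide + segment[cut:]


rng = random.Random(0)  # seeded only so the sketch is repeatable
inserted = insert_random_peptide(SEGMENT, SEGMENT_START, LOOP_IV, 8, rng, after=173)
replaced = insert_random_peptide(SEGMENT, SEGMENT_START, LOOP_IV, 12, rng)
print(inserted)
print(replaced)
```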



Figure 10 
Figure 11
SEQUENCE LISTING



<110> Delta Biotechnology Limited 

Principia Pharmaceutical Corporation 

<120> Albumin Fusion Proteins 

<130> PF542PCT 

<140> Unassigned 
<141> 2001-04-12 

<150> 60/229,358 
<151> 2000-04-12 

<150> 60/256,931 
<151> 2000-12-21 

<150> 60/199,384 
<151> 2000-04-25 

<160> 37 

<170> PatentIn Ver. 2.1

<210> 1 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> primer_bind

<223> primer useful to clone human growth hormone cDNA 
<400> 1 

cccaagaatt cccttatcca ggc 23 



<210> 2 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> primer_bind 

<223> primer useful to clone human growth hormone cDNA 
<400> 2 

gggaagctta gaagccacag gatccctcca cag 33 



<210> 3 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220>

<221> misc_structure

<223> synthetic oligonucleotide used to join DNA fragments
with non-cohesive ends.

<400> 3 

gataaagatt cccaac 16 



<210> 4 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> misc_structure 

<223> synthetic oligonucleotide used to join DNA fragments
with non-cohesive ends.

<400> 4 

aattgttggg aatcttt 17 



<210> 5 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> misc_structure

<223> synthetic oligonucleotide used to join DNA fragments
with non-cohesive ends.

<400> 5 

ttaggcttat tcccaac 17 



<210> 6 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> misc_structure 

<223> synthetic oligonucleotide used to join DNA fragments
with non-cohesive ends.

<400> 6 

aattgttggg aataagcc 18 



<210> 7 
<211> 24 
<212> PRT 

<213> Artificial Sequence 
<220> 

<221> SITE
<222> (1)..(19)

<223> invertase leader sequence
<220>

<221> SITE
<222> (20)..(24)

<223> first 5 amino acids of mature human serum albumin
<400> 7

Met Leu Leu Gln Ala Phe Leu Phe Leu Leu Ala Gly Phe Ala Ala Lys
1 5 10 15

Ile Ser Ala Asp Ala His Lys Ser
20



<210> 8
<211> 21
<212> DNA

<213> Artificial Sequence
<220>

<221> misc_structure

<223> synthetic oligonucleotide used to join DNA
fragments with non-cohesive ends.

<400> 8

gagatgcaca cctgagtgag g 21



<210> 9
<211> 27
<212> DNA

<213> Artificial Sequence
<220>

<221> misc_structure

<223> synthetic oligonucleotide used to join DNA
fragments with non-cohesive ends.

<400> 9

gatcctgtgg cttcgatgca cacaaga 27



<210> 10
<211> 24
<212> DNA

<213> Artificial Sequence
<220>

<221> misc_structure

<223> synthetic oligonucleotide used to join DNA
fragments with non-cohesive ends.

<400> 10

ctcttgtgtg catcgaagcc acag 24

<210> 11 
<211> 30 
<212> DNA 



<213> Artificial Sequence 
<220>

<221> misc_structure 

<223> synthetic oligonucleotide used to join DNA 
fragments with non-cohesive ends. 

<400> 11 

tgtggaagag cctcagaatt tattcccaac 30 



<210> 12
<211> 31
<212> DNA

<213> Artificial Sequence
<220>

<221> misc_structure

<223> synthetic oligonucleotide used to join DNA
fragments with non-cohesive ends.

<400> 12

aattgttggg aataaattct gaggctcttc c 31



<210> 13
<211> 47
<212> DNA

<213> Artificial Sequence
<220>

<221> misc_structure

<223> synthetic oligonucleotide used to join DNA
fragments with non-cohesive ends.

<400> 13

ttaggcttag gtggcggtgg atccggcggt ggtggatctt tcccaac 47



<210> 14
<211> 48
<212> DNA

<213> Artificial Sequence
<220>

<221> misc_structure

<223> synthetic oligonucleotide used to join DNA
fragments with non-cohesive ends.

<400> 14

aattgttggg aaagatccac caccgccgga tccaccgcca cctaagcc 48



<210> 15
<211> 62
<212> DNA

<213> Artificial Sequence
<220>

<221> misc_structure

<223> synthetic oligonucleotide used to join DNA
fragments with non-cohesive ends.

<400> 15

ttaggcttag gcggtggtgg atctggtggc ggcggatctg gtggcggtgg atccttccca 60
ac 62



<210> 16 
<211> 63 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> misc_structure 

<223> synthetic oligonucleotide used to join DNA
fragments with non-cohesive ends.

<400> 16 

aattgttggg aaggatccac cgccaccaga tccgccgcca ccagatccac caccgcctaa 60 
gcc 63 



<210> 17 
<211> 1782 
<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(1755)



<400> 17 

gat gca cac aag agt gag gtt gct cat cgg ttt aaa gat ttg gga gaa 48
Asp Ala His Lys Ser Glu Val Ala His Arg Phe Lys Asp Leu Gly Glu
1 5 10 15

gaa aat ttc aaa gcc ttg gtg ttg att gcc ttt gct cag tat ctt cag 96
Glu Asn Phe Lys Ala Leu Val Leu Ile Ala Phe Ala Gln Tyr Leu Gln
20 25 30

cag tgt cca ttt gaa gat cat gta aaa tta gtg aat gaa gta act gaa 144
Gln Cys Pro Phe Glu Asp His Val Lys Leu Val Asn Glu Val Thr Glu
35 40 45

ttt gca aaa aca tgt gtt gct gat gag tca gct gaa aat tgt gac aaa 192
Phe Ala Lys Thr Cys Val Ala Asp Glu Ser Ala Glu Asn Cys Asp Lys
50 55 60

tca ctt cat acc ctt ttt gga gac aaa tta tgc aca gtt gca act ctt 240
Ser Leu His Thr Leu Phe Gly Asp Lys Leu Cys Thr Val Ala Thr Leu
65 70 75 80

cgt gaa acc tat ggt gaa atg gct gac tgc tgt gca aaa caa gaa cct 288
Arg Glu Thr Tyr Gly Glu Met Ala Asp Cys Cys Ala Lys Gln Glu Pro
85 90 95

gag aga aat gaa tgc ttc ttg caa cac aaa gat gac aac cca aac ctc 336
Glu Arg Asn Glu Cys Phe Leu Gln His Lys Asp Asp Asn Pro Asn Leu
100 105 110
ccc cga ttg gtg aga cca gag gtt gat gtg atg tgc act gct ttt cat 384
Pro Arg Leu Val Arg Pro Glu Val Asp Val Met Cys Thr Ala Phe His
115 120 125

gac aat gaa gag aca ttt ttg aaa aaa tac tta tat gaa att gcc aga 432
Asp Asn Glu Glu Thr Phe Leu Lys Lys Tyr Leu Tyr Glu Ile Ala Arg
130 135 140

aga cat cct tac ttt tat gcc ccg gaa ctc ctt ttc ttt gct aaa agg 480
Arg His Pro Tyr Phe Tyr Ala Pro Glu Leu Leu Phe Phe Ala Lys Arg
145 150 155 160

tat aaa gct gct ttt aca gaa tgt tgc caa gct gct gat aaa gct gcc 528
Tyr Lys Ala Ala Phe Thr Glu Cys Cys Gln Ala Ala Asp Lys Ala Ala
165 170 175

tgc ctg ttg cca aag ctc gat gaa ctt cgg gat gaa ggg aag gct tcg 576
Cys Leu Leu Pro Lys Leu Asp Glu Leu Arg Asp Glu Gly Lys Ala Ser
180 185 190

tct gcc aaa cag aga ctc aaa tgt gcc agt ctc caa aaa ttt gga gaa 624
Ser Ala Lys Gln Arg Leu Lys Cys Ala Ser Leu Gln Lys Phe Gly Glu
195 200 205

aga gct ttc aaa gca tgg gca gtg gct cgc ctg agc cag aga ttt ccc 672
Arg Ala Phe Lys Ala Trp Ala Val Ala Arg Leu Ser Gln Arg Phe Pro
210 215 220

aaa gct gag ttt gca gaa gtt tcc aag tta gtg aca gat ctt acc aaa 720
Lys Ala Glu Phe Ala Glu Val Ser Lys Leu Val Thr Asp Leu Thr Lys
225 230 235 240

gtc cac acg gaa tgc tgc cat gga gat ctg ctt gaa tgt gct gat gac 768
Val His Thr Glu Cys Cys His Gly Asp Leu Leu Glu Cys Ala Asp Asp
245 250 255

agg gcg gac ctt gcc aag tat atc tgt gaa aat cag gat tcg atc tcc 816
Arg Ala Asp Leu Ala Lys Tyr Ile Cys Glu Asn Gln Asp Ser Ile Ser
260 265 270

agt aaa ctg aag gaa tgc tgt gaa aaa cct ctg ttg gaa aaa tcc cac 864
Ser Lys Leu Lys Glu Cys Cys Glu Lys Pro Leu Leu Glu Lys Ser His
275 280 285

tgc att gcc gaa gtg gaa aat gat gag atg cct gct gac ttg cct tca 912
Cys Ile Ala Glu Val Glu Asn Asp Glu Met Pro Ala Asp Leu Pro Ser
290 295 300

tta gct gct gat ttt gtt gaa agt aag gat gtt tgc aaa aac tat gct 960
Leu Ala Ala Asp Phe Val Glu Ser Lys Asp Val Cys Lys Asn Tyr Ala
305 310 315 320

gag gca aag gat gtc ttc ctg ggc atg ttt ttg tat gaa tat gca aga 1008
Glu Ala Lys Asp Val Phe Leu Gly Met Phe Leu Tyr Glu Tyr Ala Arg
325 330 335

agg cat cct gat tac tct gtc gtg ctg ctg ctg aga ctt gcc aag aca 1056
Arg His Pro Asp Tyr Ser Val Val Leu Leu Leu Arg Leu Ala Lys Thr
340 345 350
tat gaa acc act cta gag aag tgc tgt gcc gct gca gat cct cat gaa 1104
Tyr Glu Thr Thr Leu Glu Lys Cys Cys Ala Ala Ala Asp Pro His Glu
355 360 365

tgc tat gcc aaa gtg ttc gat gaa ttt aaa cct ctt gtg gaa gag cct 1152
Cys Tyr Ala Lys Val Phe Asp Glu Phe Lys Pro Leu Val Glu Glu Pro
370 375 380

cag aat tta atc aaa caa aac tgt gag ctt ttt gag cag ctt gga gag 1200
Gln Asn Leu Ile Lys Gln Asn Cys Glu Leu Phe Glu Gln Leu Gly Glu
385 390 395 400

tac aaa ttc cag aat gcg cta tta gtt cgt tac acc aag aaa gta ccc 1248
Tyr Lys Phe Gln Asn Ala Leu Leu Val Arg Tyr Thr Lys Lys Val Pro
405 410 415

caa gtg tca act cca act ctt gta gag gtc tca aga aac cta gga aaa 1296
Gln Val Ser Thr Pro Thr Leu Val Glu Val Ser Arg Asn Leu Gly Lys
420 425 430

gtg ggc agc aaa tgt tgt aaa cat cct gaa gca aaa aga atg ccc tgt 1344
Val Gly Ser Lys Cys Cys Lys His Pro Glu Ala Lys Arg Met Pro Cys
435 440 445

gca gaa gac tat cta tcc gtg gtc ctg aac cag tta tgt gtg ttg cat 1392
Ala Glu Asp Tyr Leu Ser Val Val Leu Asn Gln Leu Cys Val Leu His
450 455 460

gag aaa acg cca gta agt gac aga gtc aca aaa tgc tgc aca gag tcc 1440
Glu Lys Thr Pro Val Ser Asp Arg Val Thr Lys Cys Cys Thr Glu Ser
465 470 475 480

ttg gtg aac agg cga cca tgc ttt tca gct ctg gaa gtc gat gaa aca 1488
Leu Val Asn Arg Arg Pro Cys Phe Ser Ala Leu Glu Val Asp Glu Thr
485 490 495

tac gtt ccc aaa gag ttt aat gct gaa aca ttc acc ttc cat gca gat 1536
Tyr Val Pro Lys Glu Phe Asn Ala Glu Thr Phe Thr Phe His Ala Asp
500 505 510

ata tgc aca ctt tct gag aag gag aga caa atc aag aaa caa act gca 1584
Ile Cys Thr Leu Ser Glu Lys Glu Arg Gln Ile Lys Lys Gln Thr Ala
515 520 525

ctt gtt gag ctt gtg aaa cac aag ccc aag gca aca aaa gag caa ctg 1632
Leu Val Glu Leu Val Lys His Lys Pro Lys Ala Thr Lys Glu Gln Leu
530 535 540

aaa gct gtt atg gat gat ttc gca gct ttt gta gag aag tgc tgc aag 1680
Lys Ala Val Met Asp Asp Phe Ala Ala Phe Val Glu Lys Cys Cys Lys
545 550 555 560

gct gac gat aag gag acc tgc ttt gcc gag gag ggt aaa aaa ctt gtt 1728
Ala Asp Asp Lys Glu Thr Cys Phe Ala Glu Glu Gly Lys Lys Leu Val
565 570 575

gct gca agt caa gct gcc tta ggc tta taacatctac atttaaaagc atctcag 1782
Ala Ala Ser Gln Ala Ala Leu Gly Leu
580 585
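
The pairing of codons and residues in SEQ ID NO: 17 can be spot-checked with a small translation routine. The sketch below is illustrative only: the helper name `translate` and the compact encoding of the standard genetic code are assumptions, not part of the listing. It reads the first ten codons as gat gca cac aag agt gag gtt gct cat cgg and the last nine codons plus the stop as gct gca agt caa gct gcc tta ggc tta taa.

```python
# Standard genetic code, codons enumerated in t, c, a, g order at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into one-letter residues."""
    dna = dna.replace(" ", "").lower()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


FIRST_CODONS = "gat gca cac aag agt gag gtt gct cat cgg"
LAST_CODONS = "gct gca agt caa gct gcc tta ggc tta taa"

print(translate(FIRST_CODONS))  # compare with residues 1-10 of SEQ ID NO: 18
print(translate(LAST_CODONS))   # compare with residues 577-585, then the stop
```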
<210> 18
<211> 585
<212> PRT

<213> Homo sapiens
<400> 18

Asp Ala His Lys Ser Glu Val Ala His Arg Phe Lys Asp Leu Gly Glu
1 5 10 15

Glu Asn Phe Lys Ala Leu Val Leu Ile Ala Phe Ala Gln Tyr Leu Gln
20 25 30

Gln Cys Pro Phe Glu Asp His Val Lys Leu Val Asn Glu Val Thr Glu
35 40 45

Phe Ala Lys Thr Cys Val Ala Asp Glu Ser Ala Glu Asn Cys Asp Lys
50 55 60

Ser Leu His Thr Leu Phe Gly Asp Lys Leu Cys Thr Val Ala Thr Leu
65 70 75 80

Arg Glu Thr Tyr Gly Glu Met Ala Asp Cys Cys Ala Lys Gln Glu Pro
85 90 95

Glu Arg Asn Glu Cys Phe Leu Gln His Lys Asp Asp Asn Pro Asn Leu
100 105 110

Pro Arg Leu Val Arg Pro Glu Val Asp Val Met Cys Thr Ala Phe His
115 120 125

Asp Asn Glu Glu Thr Phe Leu Lys Lys Tyr Leu Tyr Glu Ile Ala Arg
130 135 140

Arg His Pro Tyr Phe Tyr Ala Pro Glu Leu Leu Phe Phe Ala Lys Arg
145 150 155 160

Tyr Lys Ala Ala Phe Thr Glu Cys Cys Gln Ala Ala Asp Lys Ala Ala
165 170 175

Cys Leu Leu Pro Lys Leu Asp Glu Leu Arg Asp Glu Gly Lys Ala Ser
180 185 190

Ser Ala Lys Gln Arg Leu Lys Cys Ala Ser Leu Gln Lys Phe Gly Glu
195 200 205

Arg Ala Phe Lys Ala Trp Ala Val Ala Arg Leu Ser Gln Arg Phe Pro
210 215 220

Lys Ala Glu Phe Ala Glu Val Ser Lys Leu Val Thr Asp Leu Thr Lys
225 230 235 240

Val His Thr Glu Cys Cys His Gly Asp Leu Leu Glu Cys Ala Asp Asp
245 250 255

Arg Ala Asp Leu Ala Lys Tyr Ile Cys Glu Asn Gln Asp Ser Ile Ser
260 265 270

Ser Lys Leu Lys Glu Cys Cys Glu Lys Pro Leu Leu Glu Lys Ser His
275 280 285
Cys Ile Ala Glu Val Glu Asn Asp Glu Met Pro Ala Asp Leu Pro Ser
290 295 300

Leu Ala Ala Asp Phe Val Glu Ser Lys Asp Val Cys Lys Asn Tyr Ala
305 310 315 320

Glu Ala Lys Asp Val Phe Leu Gly Met Phe Leu Tyr Glu Tyr Ala Arg
325 330 335

Arg His Pro Asp Tyr Ser Val Val Leu Leu Leu Arg Leu Ala Lys Thr
340 345 350

Tyr Glu Thr Thr Leu Glu Lys Cys Cys Ala Ala Ala Asp Pro His Glu
355 360 365

Cys Tyr Ala Lys Val Phe Asp Glu Phe Lys Pro Leu Val Glu Glu Pro
370 375 380

Gln Asn Leu Ile Lys Gln Asn Cys Glu Leu Phe Glu Gln Leu Gly Glu
385 390 395 400

Tyr Lys Phe Gln Asn Ala Leu Leu Val Arg Tyr Thr Lys Lys Val Pro
405 410 415

Gln Val Ser Thr Pro Thr Leu Val Glu Val Ser Arg Asn Leu Gly Lys
420 425 430

Val Gly Ser Lys Cys Cys Lys His Pro Glu Ala Lys Arg Met Pro Cys
435 440 445

Ala Glu Asp Tyr Leu Ser Val Val Leu Asn Gln Leu Cys Val Leu His
450 455 460

Glu Lys Thr Pro Val Ser Asp Arg Val Thr Lys Cys Cys Thr Glu Ser
465 470 475 480

Leu Val Asn Arg Arg Pro Cys Phe Ser Ala Leu Glu Val Asp Glu Thr
485 490 495

Tyr Val Pro Lys Glu Phe Asn Ala Glu Thr Phe Thr Phe His Ala Asp
500 505 510

Ile Cys Thr Leu Ser Glu Lys Glu Arg Gln Ile Lys Lys Gln Thr Ala
515 520 525

Leu Val Glu Leu Val Lys His Lys Pro Lys Ala Thr Lys Glu Gln Leu
530 535 540

Lys Ala Val Met Asp Asp Phe Ala Ala Phe Val Glu Lys Cys Cys Lys
545 550 555 560

Ala Asp Asp Lys Glu Thr Cys Phe Ala Glu Glu Gly Lys Lys Leu Val
565 570 575

Ala Ala Ser Gln Ala Ala Leu Gly Leu
580 585
<210> 19
<211> 57 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> primer_bind 

<223> primer used to generate XhoI and ClaI
site in pPPC0006

<400> 19 

gcctcgagaa aagagatgca cacaagagtg aggttgctca tcgatttaaa gatttgg 57 



<210> 20 
<211> 58 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> primer_bind 

<223> primer used in generation of XhoI and ClaI
site in pPPC0006

<400> 20 

aatcgatgag caacctcact cttgtgtgca tctcttttct cgaggctcct ggaataag 58 



<210> 21 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> primer_bind 

<223> primer used in generation of XhoI and ClaI
site in pPPC0006

<400> 21 

tacaaactta agagtccaat tagc 24 

<210> 22 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> primer_bind

<223> primer used in generation of XhoI and ClaI
site in pPPC0006

<400> 22

cacttctcta gagtggtttc atatgtctt 29



<210> 23 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220>

<221> Misc_Structure

<223> Synthetic oligonucleotide used to alter restriction 
sites in pPPC0007 

<400> 23 

aagctgcctt aggcttataa taaggcgcgc cggccggccg tttaaactaa gcttaattct 60 



<210> 24 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> Misc_Structure 

<223> Synthetic oligonucleotide used to alter restriction 
sites in pPPC0007 

<400> 24 

agaattaagc ttagtttaaa cggccggccg gcgcgcctta ttataagcct aaggcagctt 60 
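
The listing does not state how SEQ ID NOs: 23 and 24 relate to each other, but they read as the two strands of one duplex, and that can be checked by computing a reverse complement. The sketch below is a minimal illustration; the helper name `reverse_complement` is an assumption and not part of the listing.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string (spaces ignored)."""
    return seq.replace(" ", "").translate(COMPLEMENT)[::-1]


SEQ_23 = "aagctgcctt aggcttataa taaggcgcgc cggccggccg tttaaactaa gcttaattct"
SEQ_24 = "agaattaagc ttagtttaaa cggccggccg gcgcgcctta ttataagcct aaggcagctt"

# True when the two oligonucleotides anneal over their full length.
print(reverse_complement(SEQ_23) == SEQ_24.replace(" ", ""))
```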



<210> 25
<211> 32
<212> DNA

<213> Artificial Sequence
<220>

<221> primer_bind

<223> forward primer useful for generation of albumin
fusion protein in which the albumin moiety is N-terminal
of the Therapeutic Protein

<220>
<221> misc_feature
<222> (18)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (19)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (20)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (21)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (22)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (23)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (24)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (25)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (26)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (27)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (28)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (29)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (30)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (31)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (32)
<223> n equals a,t,g, or c
<400> 25

aagctgcctt aggcttannn nnnnnnnnnn nn 32

<210> 26
<211> 51
<212> DNA

<213> Artificial Sequence
<220>

<221> primer_bind

<223> reverse primer useful for generation of albumin
fusion protein in which the albumin moiety is N-terminal
of the Therapeutic Protein

<220>
<221> misc_feature
<222> (37)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (38)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (39)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (40)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (41)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (42)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (43)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (44)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (45)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (46)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (47)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (48)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (49)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (50)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (51)
<223> n equals a,t,g, or c
<400> 26

gcgcgcgttt aaacggccgg ccggcgcgcc ttattannnn nnnnnnnnnn n 51



<210> 27
<211> 33
<212> DNA

<213> Artificial Sequence
<220>

<223> forward primer useful for generation of albumin fusion
protein in which the albumin moiety is C-terminal of the
Therapeutic Protein

<220>
<221> misc_feature
<222> (19)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (20)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (21)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (22)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (23)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (24)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (25)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (26)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (27)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (28)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (29)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (30)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (31)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (32)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (33)
<223> n equals a,t,g, or c
<400> 27

aggagcgtcg acaaaagann nnnnnnnnnn nnn 33



<210> 28
<211> 52
<212> DNA

<213> Artificial Sequence
<220>

<221> primer_bind

<223> reverse primer useful for generation of albumin
fusion protein in which the albumin moiety is C-terminal of
the Therapeutic Protein

<220>
<221> misc_feature
<222> (38)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (39)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (40)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (41)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (42)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (43)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (44)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (45)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (46)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (47)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (48)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (49)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (50)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (51)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (52)
<223> n equals a,t,g, or c
<400> 28

ctttaaatcg atgagcaacc tcactcttgt gtgcatcnnn nnnnnnnnnn nn 52



<210> 29 
<211> 24 
<212> PRT 

<213> Artificial Sequence 
<220> 

<221> signal 

<223> signal peptide of natural human serum albumin protein 
<400> 29 

Met Lys Trp Val Ser Phe Ile Ser Leu Leu Phe Leu Phe Ser Ser Ala
1 5 10 15

Tyr Ser Arg Ser Leu Asp Lys Arg 
20 

<210> 30
<211> 114
<212> DNA

<213> Artificial Sequence
<220>

<221> primer_bind

<223> forward primer useful for generation of pC4:HSA
albumin fusion vector

<220>
<221> misc_feature
<222> (5)..(10)
<223> BamHI restriction site
<220>
<221> misc_feature
<222> (11)..(16)
<223> HindIII restriction site
<220>
<221> misc_feature
<222> (17)..(27)
<223> Kozak sequence
<220>
<221> misc_feature
<222> (25)..(97)
<223> cds natural signal sequence of human serum albumin
<220>
<221> misc_feature
<222> (75)..(81)
<223> XhoI restriction site
<220>
<221> misc_feature
<222> (98)..(114)
<223> cds first six amino acids of human serum albumin
<400> 30

tcagggatcc aagcttccgc caccatgaag tgggtaacct ttatttccct tctttttctc 60

tttagctcgg cttactcgag gggtgtgttt cgtcgagatg cacacaagag tgag 114

<210> 31
<211> 43
<212> DNA

<213> Artificial Sequence
<220>

<221> primer_bind

<223> reverse primer useful for generation of
pC4:HSA albumin fusion vector

<220>
<221> misc_feature
<222> (6)..(11)
<223> Asp718 restriction site
<220>
<221> misc_feature
<222> (12)..(17)
<223> EcoRI restriction site
<220>
<221> misc_feature
<222> (15)..(17)
<223> reverse complement of stop codon
<220>
<221> misc_feature
<222> (18)..(25)
<223> AscI restriction site
<220>
<221> misc_feature
<222> (18)..(43)
<223> reverse complement of DNA sequence encoding last 9 amino acids
<400> 31

gcagcggtac cgaattcggc gcgccttata agcctaaggc agc 43

<210> 32
<211> 46
<212> DNA

<213> Artificial Sequence
<220>

<221> primer_bind

<223> forward primer useful for inserting Therapeutic
protein into pC4:HSA vector

<220>
<221> misc_feature
<222> (29)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (30)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (31)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (32)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (33)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (34)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (35)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (36)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (37)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (38)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (39)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (40)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (41)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (42)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (43)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (44)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (45)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (46)
<223> n equals a,t,g, or c
<400> 32

ccgccgctcg aggggtgtgt ttcgtcgann nnnnnnnnnn nnnnnn 46

<210> 33
<211> 55
<212> DNA

<213> Artificial Sequence
<220>

<221> primer_bind

<223> reverse primer useful for inserting Therapeutic
protein into pC4:HSA vector

<220>
<221> misc_feature
<222> (38)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (39)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (40)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (41)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (42)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (43)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (44)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (45)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (46)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (47)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (48)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (49)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (50)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (51)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (52)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (53)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (54)
<223> n equals a,t,g, or c
<220>
<221> misc_feature
<222> (55)
<223> n equals a,t,g, or c
<400> 33

agtcccatcg atgagcaacc tcactcttgt gtgcatcnnn nnnnnnnnnn nnnnn 55

<210> 34
<211> 17
<212> PRT

<213> Artificial Sequence
<220>

<221> signal

<223> Stanniocalcin signal peptide
<400> 34

Met Leu Gln Asn Ser Ala Val Leu Leu Leu Leu Val Ile Ser Ala Ser
1 5 10 15

Ala

<210> 35
<211> 22
<212> PRT

<213> Artificial Sequence
<220>

<221> signal

<223> Synthetic signal peptide
<400> 35

Met Pro Thr Trp Ala Trp Trp Leu Phe Leu Val Leu Leu Leu Ala Leu
1 5 10 15

Trp Ala Pro Ala Arg Gly
20

<210> 36
<211> 66
<212> PRT

<213> Agkistrodon piscivorus
<400> 36

Ile Thr Tyr Thr Asp Cys Thr Glu Ser Gly Gln Asn Leu Cys Leu Cys
1 5 10 15

Glu Gly Ser Asn Val Cys Gly Lys Gly Asn Lys Cys Ile Leu Gly Ser
20 25 30

Gln Gly Lys Asp Asn Gln Cys Val Thr Gly Glu Gly Thr Pro Lys Pro
35 40 45

Gln Ser His Asn Gln Gly Asp Phe Glu Pro Ile Pro Glu Asp Ala Tyr
50 55 60

Asp Glu
65

<210> 37
<211> 71
<212> PRT

<213> Agkistrodon piscivorus
<400> 37

Glu Ala Gly Glu Glu Cys Asp Cys Gly Ser Pro Glu Asn Pro Cys Cys
1 5 10 15

Asp Ala Ala Thr Cys Lys Leu Arg Pro Gly Ala Gln Cys Ala Glu Gly
20 25 30

Leu Cys Cys Asp Gln Cys Lys Phe Met Lys Glu Gly Thr Val Cys Arg
35 40 45

Ala Arg Gly Asp Asp Val Asn Asp Tyr Cys Asn Gly Ile Ser Ala Gly
50 55 60

Cys Pro Arg Asn Pro Phe His
65 70
ALBUMIN FUSION PROTEINS

BACKGROUND OF THE INVENTION 

The invention relates generally to Therapeutic proteins (including, but not limited to, a
polypeptide, antibody, or peptide, or fragments and variants thereof) fused to albumin or
fragments or variants of albumin. The invention further relates to Therapeutic proteins
(including, but not limited to, a polypeptide, antibody, or peptide, or fragments and variants
thereof) fused to albumin or fragments or variants of albumin, that exhibit extended shelf-life
and/or extended or therapeutic activity in solution. These fusion proteins are herein
collectively referred to as "albumin fusion proteins of the invention." The invention
encompasses therapeutic albumin fusion proteins, compositions, pharmaceutical
compositions, formulations and kits. Nucleic acid molecules encoding the albumin fusion
proteins of the invention are also encompassed by the invention, as are vectors containing
these nucleic acids, host cells transformed with these nucleic acid vectors, and methods of
making the albumin fusion proteins of the invention using these nucleic acids, vectors, and/or
host cells.

The invention is also directed to methods of in vitro stabilizing a Therapeutic protein
via fusion or conjugation of the Therapeutic protein to albumin or fragments or variants of
albumin.

Human serum albumin (HSA, or HA), a protein of 585 amino acids in its mature form
(as shown in Figure 15 or in SEQ ID NO:18), is responsible for a significant proportion of
the osmotic pressure of serum and also functions as a carrier of endogenous and exogenous
ligands. At present, HA for clinical use is produced by extraction from human blood. The
production of recombinant HA (rHA) in microorganisms has been disclosed in EP 330 451
and EP 361 991.

The role of albumin as a carrier molecule and its inert nature are desirable properties 
for use as a carrier and transporter of polypeptides in vivo. The use of albumin as a 
component of an albumin fusion protein as a carrier for various proteins has been suggested 
in WO 93/15199, WO 93/15200, and EP 413 622. The use of N-terminal fragments of HA 

Therapeutic protein may be achieved by genetic manipulation, such that the DNA coding for 
HA, or a fragment thereof, is joined to the DNA coding for the Therapeutic protein. A 
suitable host is then transformed or transfected with the fused nucleotide sequences, so 
arranged on a suitable plasmid as to express a fusion polypeptide. The expression may be 
effected in vitro from, for example, prokaryotic or eukaryotic cells, or in vivo, e.g., from a 
transgenic organism. 

Therapeutic proteins in their native state or when recombinantly produced, such as 
interferons and growth hormones, are typically labile molecules exhibiting short shelf-lives, 
particularly when formulated in aqueous solutions. The instability in these molecules when 
formulated for administration dictates that many of the molecules must be lyophilized and 
refrigerated at all times during storage, thereby rendering the molecules difficult to transport 
and/or store. Storage problems are particularly acute when pharmaceutical formulations must 
be stored and dispensed outside of the hospital environment. Many protein and peptide drugs 
also require the addition of high concentrations of other proteins such as albumin to reduce or 
prevent loss of protein due to binding to the container. This is a major concern with respect to 
proteins such as IFN. For this reason, many Therapeutic proteins are formulated in 
combination with a large proportion of albumin carrier molecule (100-1000 fold excess), 
though this is an undesirable and expensive feature of the formulation. 

Few practical solutions to the storage problems of labile protein molecules have been 
proposed. Accordingly, there is a need for stabilized, long lasting formulations of 
proteinaceous therapeutic molecules that are easily dispensed, preferably with a simple 
formulation requiring minimal post-storage manipulation. 

SUMMARY OF THE INVENTION 
The present invention is based, in part, on the discovery that Therapeutic proteins may 
be stabilized to extend the shelf-life, and/or to retain the Therapeutic protein's activity for 
extended periods of time in solution, in vitro and/or in vivo, by genetically or chemically 
fusing or conjugating the Therapeutic protein to albumin or a fragment (portion) or variant of 
albumin, that is sufficient to stabilize the protein and/or its activity. In addition it has been 
determined that the use of albumin-fusion proteins or albumin conjugated proteins may reduce 
the need to formulate protein solutions with large excesses of carrier proteins (such as 
albumin, unfused) to prevent loss of Therapeutic proteins due to factors such as binding to the 
container. 

The present invention encompasses albumin fusion proteins comprising a Therapeutic 
protein (e.g., a polypeptide, antibody, or peptide, or fragments and variants thereof) fused to 
albumin or a fragment (portion) or variant of albumin. The present invention also 
encompasses albumin fusion proteins comprising a Therapeutic protein (e.g., a polypeptide, 
antibody, or peptide, or fragments and variants thereof) fused to albumin or a fragment 
(portion) or variant of albumin, that is sufficient to prolong the shelf life of the Therapeutic 
protein, and/or stabilize the Therapeutic protein and/or its activity in solution (or in a 
pharmaceutical composition) in vitro and/or in vivo. Nucleic acid molecules encoding the 
albumin fusion proteins of the invention are also encompassed by the invention, as are vectors 
containing these nucleic acids, host cells transformed with these nucleic acids or vectors, and 
methods of making the albumin fusion proteins of the invention and using these nucleic acids, 
vectors, and/or host cells. 

The invention also encompasses pharmaceutical formulations comprising an albumin 
fusion protein of the invention and a pharmaceutically acceptable diluent or carrier. Such 
formulations may be in a kit or container. Such kit or container may be packaged with 
instructions pertaining to the extended shelf life of the Therapeutic protein. Such formulations 
may be used in methods of treating, preventing, ameliorating or diagnosing a disease or 
disease symptom in a patient, preferably a mammal, most preferably a human, comprising the 
step of administering the pharmaceutical formulation to the patient. 

In other embodiments, the present invention encompasses methods of preventing, 
treating, or ameliorating a disease or disorder. In preferred embodiments, the present 
invention encompasses a method of treating a disease or disorder listed in the "Preferred 
Indication Y" column of Table 1 comprising administering to a patient in which such 
treatment, prevention or amelioration is desired an albumin fusion protein of the invention that 
comprises a Therapeutic protein portion corresponding to a Therapeutic protein (or fragment 
or variant thereof) disclosed in the "Therapeutic Protein X" column of Table 1 (in the same 
row as the disease or disorder to be treated is listed in the "Preferred Indication Y" column of 
Table 1) in an amount effective to treat, prevent or ameliorate the disease or disorder. 

In another embodiment, the invention includes a method of extending the shelf life of 
a Therapeutic protein (e.g., a polypeptide, antibody, or peptide, or fragments and variants 
thereof) comprising the step of fusing or conjugating the Therapeutic protein to albumin or a 
fragment (portion) or variant of albumin, that is sufficient to extend the shelf-life of the 
Therapeutic protein. In a preferred embodiment, the Therapeutic protein used according to 
this method is fused to the albumin, or the fragment or variant of albumin. In a most 
preferred embodiment, the Therapeutic protein used according to this method is fused to 
albumin, or a fragment or variant of albumin, via recombinant DNA technology or genetic 
engineering. 

In another embodiment, the invention includes a method of stabilizing a Therapeutic 
protein (e.g., a polypeptide, antibody, or peptide, or fragments and variants thereof) in 
solution, comprising the step of fusing or conjugating the Therapeutic protein to albumin or a 
fragment (portion) or variant of albumin, that is sufficient to stabilize the Therapeutic protein. 
In a preferred embodiment, the Therapeutic protein used according to this method is fused to 
the albumin, or the fragment or variant of albumin. In a most preferred embodiment, the 
Therapeutic protein used according to this method is fused to albumin, or a fragment or 
variant of albumin, via recombinant DNA technology or genetic engineering. 

The present invention further includes transgenic organisms modified to contain the 
nucleic acid molecules of the invention, preferably modified to express the albumin fusion 
proteins encoded by the nucleic acid molecules. 

BRIEF DESCRIPTION OF THE FIGURES 
Figure 1 depicts the extended shelf-life of an HA fusion protein in terms of the 
biological activity (Nb2 cell proliferation) of HA-hGH remaining after incubation in cell 
culture media for up to 5 weeks at 37°C. Under these conditions, hGH has no observed 
activity by week 2. 

Figure 2 depicts the extended shelf-life of an HA fusion protein in terms of the stable 
biological activity (Nb2 cell proliferation) of HA-hGH remaining after incubation in cell 
culture media for up to 3 weeks at 4, 37, or 50°C. Data are normalized to the biological 
activity of hGH at time zero. 

Figures 3A and 3B compare the biological activity of HA-hGH with hGH in the Nb2 
cell proliferation assay. Figure 3A shows proliferation after 24 hours of incubation with 
various concentrations of hGH or the albumin fusion protein, and Figure 3B shows 
proliferation after 48 hours of incubation with various concentrations of hGH or the albumin 
fusion protein. 

Figure 4 shows a map of a plasmid (pPPC0005) that can be used as the base vector 
into which polynucleotides encoding the Therapeutic proteins (including polypeptides and 
fragments and variants thereof) may be cloned to form HA-fusions. Plasmid Map key: 
PRB1p: PRB1 S. cerevisiae promoter; FL: Fusion leader sequence; rHA: cDNA encoding 
HA; ADH1t: ADH1 S. cerevisiae terminator; T3: T3 sequencing primer site; T7: T7 
sequencing primer site; Amp R: β-lactamase gene; ori: origin of replication. Please note that 
in the provisional applications to which this application claims priority, the plasmid in Figure 
4 was labeled pPPC0006, instead of pPPC0005. In addition the drawing of this plasmid did 
not show certain pertinent restriction sites in this vector. Thus in the present application, the 
drawing is labeled pPPC0005 and more restriction sites of the same vector are shown. 

Figure 5 compares the recovery of vial-stored HA-IFN solutions of various 
concentrations with a stock solution after 48 or 72 hours of storage. 

Figure 6 compares the activity of an HA-α-IFN fusion protein after administration to 
monkeys via IV or SC. 

Figure 7 describes the bioavailability and stability of an HA-α-IFN fusion protein. 

Figure 8 is a map of an expression vector for the production of HA-α-IFN. 

Figure 9 shows the location of loops in HA. 

Figure 10 is an example of the modification of an HA loop. 

Figure 11 is a representation of the HA loops. 

Figure 12 shows the HA loop IV. 

Figure 13 shows the tertiary structure of HA. 

Figure 14 shows an example of a scFv-HA fusion. 

Figure 15 shows the amino acid sequence of the mature form of human albumin (SEQ 
ID NO: 18) and a polynucleotide encoding it (SEQ ID NO: 17). 

DETAILED DESCRIPTION 

As described above, the present invention is based, in part, on the discovery that a 
Therapeutic protein (e.g., a polypeptide, antibody, or peptide, or fragments and variants 
thereof) may be stabilized to extend the shelf-life and/or retain the Therapeutic protein's 
activity for extended periods of time in solution (or in a pharmaceutical composition) in vitro 
and/or in vivo, by genetically fusing or chemically conjugating the Therapeutic protein, 
polypeptide or peptide to all or a portion of albumin sufficient to stabilize the protein and its 
activity. 

The present invention relates generally to albumin fusion proteins and methods of 
treating, preventing, or ameliorating diseases or disorders. As used herein, "albumin fusion 
protein" refers to a protein formed by the fusion of at least one molecule of albumin (or a 
fragment or variant thereof) to at least one molecule of a Therapeutic protein (or fragment or 
variant thereof). An albumin fusion protein of the invention comprises at least a fragment or 
variant of a Therapeutic protein and at least a fragment or variant of human serum albumin, 
which are associated with one another, preferably by genetic fusion (i.e., the albumin fusion 
protein is generated by translation of a nucleic acid in which a polynucleotide encoding all or a 
portion of a Therapeutic protein is joined in-frame with a polynucleotide encoding all or a 
portion of albumin) or chemical conjugation to one another. The Therapeutic protein and 
albumin protein, once part of the albumin fusion protein, may be referred to as a "portion", 
"region" or "moiety" of the albumin fusion protein (e.g., a "Therapeutic protein portion" or an 
"albumin protein portion"). 

In one embodiment, the invention provides an albumin fusion protein comprising, or 
alternatively consisting of, a Therapeutic protein (e.g., as described in Table 1) and a serum 
albumin protein. In other embodiments, the invention provides an albumin fusion protein 
comprising, or alternatively consisting of, a biologically active and/or therapeutically active 
fragment of a Therapeutic protein and a serum albumin protein. In other embodiments, the 
invention provides an albumin fusion protein comprising, or alternatively consisting of, a 
biologically active and/or therapeutically active variant of a Therapeutic protein and a serum 
albumin protein. In preferred embodiments, the serum albumin protein component of the 
albumin fusion protein is the mature portion of serum albumin. 
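
As an illustration of the in-frame genetic fusion referred to above (a polynucleotide encoding 
a Therapeutic protein portion joined in-frame with a polynucleotide encoding an albumin 
portion), the following is a minimal sketch only. The function names `translate` and 
`fuse_in_frame` and the short toy sequences are assumptions made for illustration and are not 
taken from this application; either portion may be placed upstream. The sketch removes the 
stop codon of the upstream portion so that translation continues into the downstream portion. 

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(dna), 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def fuse_in_frame(upstream: str, downstream: str) -> str:
    """Join two coding sequences so that the downstream one stays in frame."""
    upstream, downstream = upstream.upper(), downstream.upper()
    # Remove trailing stop codon(s) from the upstream portion.
    while upstream[-3:] in STOP_CODONS:
        upstream = upstream[:-3]
    if len(upstream) % 3 != 0 or len(downstream) % 3 != 0:
        raise ValueError("both portions must consist of whole codons")
    return upstream + downstream


# Toy sequences (hypothetical, for illustration only).
upstream_orf = "ATGGCTTAA"          # Met-Ala-stop
downstream_orf = "GATGCACACTAATAA"  # Asp-Ala-His followed by a double stop codon
fusion = fuse_in_frame(upstream_orf, downstream_orf)
print(fusion)
print(translate(fusion))
```

If either portion does not consist of whole codons, the sketch raises an error rather than 
producing a frameshifted fusion. 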

In further embodiments, the invention provides an albumin fusion protein comprising, 
or alternatively consisting of, a Therapeutic protein, and a biologically active and/or 
therapeutically active fragment of serum albumin. In further embodiments, the invention 
provides an albumin fusion protein comprising, or alternatively consisting of, a Therapeutic 
protein and a biologically active and/or therapeutically active variant of serum albumin. In 
preferred embodiments, the Therapeutic protein portion of the albumin fusion protein is the 
mature portion of the Therapeutic protein. In a further preferred embodiment, the Therapeutic 
protein portion of the albumin fusion protein is the extracellular soluble domain of the 
Therapeutic protein. In an alternative embodiment, the Therapeutic protein portion of the 
albumin fusion protein is the active form of the Therapeutic protein. 

In further embodiments, the invention provides an albumin fusion protein comprising, 
or alternatively consisting of, a biologically active and/or therapeutically active fragment or 
variant of a Therapeutic protein and a biologically active and/or therapeutically active fragment 
or variant of serum albumin. In preferred embodiments, the invention provides an albumin 
fusion protein comprising, or alternatively consisting of, the mature portion of a Therapeutic 
protein and the mature portion of serum albumin. 

Therapeutic proteins 

As stated above, an albumin fusion protein of the invention comprises at least a 
fragment or variant of a Therapeutic protein and at least a fragment or variant of human serum 
albumin, which are associated with one another, preferably by genetic fusion or chemical 
conjugation. 

As used herein, "Therapeutic protein" refers to proteins, polypeptides, antibodies, 
peptides or fragments or variants thereof, having one or more therapeutic and/or biological 
activities. Therapeutic proteins encompassed by the invention include but are not limited to, 
proteins, polypeptides, peptides, antibodies, and biologics. (The terms peptides, proteins, 
and polypeptides are used interchangeably herein.) It is specifically contemplated that the 
term "Therapeutic protein" encompasses antibodies and fragments and variants thereof. Thus 
an albumin fusion protein of the invention may contain at least a fragment or variant of a 
Therapeutic protein, and/or at least a fragment or variant of an antibody. Additionally, the 
term "Therapeutic protein" may refer to the endogenous or naturally occurring correlate of a 
Therapeutic protein. 

By a polypeptide displaying a "therapeutic activity" or a protein that is "therapeutically 
active" is meant a polypeptide that possesses one or more known biological and/or therapeutic 
activities associated with a Therapeutic protein such as one or more of the Therapeutic 
proteins described herein or otherwise known in the art. As a non-limiting example, a 
"Therapeutic protein" is a protein that is useful to treat, prevent or ameliorate a disease, 
condition or disorder. As a non-limiting example, a "Therapeutic protein" may be one that 
binds specifically to a particular cell type (normal (e.g., lymphocytes) or abnormal (e.g., 
cancer cells)) and therefore may be used to target a compound (drug, or cytotoxic agent) to 
that cell type specifically. 

In another non-limiting example, a "Therapeutic protein" is a protein that has a 
biological activity, and in particular, a biological activity that is useful for treating, preventing 
or ameliorating a disease. A non-inclusive list of biological activities that may be possessed by 
a Therapeutic protein includes enhancing the immune response, promoting angiogenesis, 
inhibiting angiogenesis, regulating hematopoietic functions, stimulating nerve growth, 
enhancing an immune response, inhibiting an immune response, or any one or more of the 
biological activities described in the "Biological Activities" section below. 

As used herein, "therapeutic activity" or "activity" may refer to an activity whose 
effect is consistent with a desirable therapeutic outcome in humans, or to desired effects in 
non-human mammals or in other species or organisms. Therapeutic activity may be measured 
in vivo or in vitro. For example, a desirable effect may be assayed in cell culture. As an 
example, when hGH is the Therapeutic protein, the effects of hGH on cell proliferation as 
described in Example 1 may be used as the endpoint for which therapeutic activity is 
measured. Such in vitro or cell culture assays are commonly available for many Therapeutic 
proteins as described in the art. 

Examples of useful assays for particular Therapeutic proteins include, but are not 
limited to, GMCSF (Eaves, A.C. and Eaves C.J., Erythropoiesis in culture. In: McCullock 
EA (edt) Cell culture techniques - Clinics in hematology. WB Saunders, Eastbourne, pp 
371-91 (1984); Metcalf, D., International Journal of Cell Cloning 10:116-25 (1992); Testa, 
N.G., et al., Assays for hematopoietic growth factors. In: Balkwill FR (edt) Cytokines, A 
practical Approach, pp 229-44; IRL Press Oxford 1991); EPO (bioassay: Kitamura et al., J. 
Cell. Physiol. 140 p323 (1989)); Hirudin (platelet aggregation assay: Blood Coagul 
Fibrinolysis 7(2):259-61 (1996)); IFNα (anti-viral assay: Rubinstein et al., J. Virol. 
37(2):755-8 (1981); anti-proliferative assay: Gao Y, et al., Mol Cell Biol. 19(11):7305-13 
(1999); and bioassay: Czarniecki et al., J. Virol. 49 p490 (1984)); GCSF (bioassay: Shirafuji 
et al., Exp. Hematol. 17 p116 (1989); proliferation of murine NFS-60 cells (Weinstein et al., 
Proc Natl Acad Sci 83:5010-4 (1986))); insulin (3H-glucose uptake assay: Steppan et al., 
Nature 409(6818):307-12 (2001)); hGH (Ba/F3-hGHR proliferation assay: J Clin Endocrinol 
Metab 85(11):4274-9 (2000); International standard for growth hormone: Horm Res, 51 
Suppl 1:7-12 (1999)); factor X (factor X activity assay: Van Wijk et al., Thromb Res 
22:681-686 (1981)); factor VII (coagulation assay using prothrombin clotting time: Belaaouaj 
et al., J. Biol. Chem. 275:27123-8 (2000); Diaz-Collier et al., Thromb Haemost 71:339-46 
(1994)), or as shown in Table 1 in the "Exemplary Activity Assay" column. 

Therapeutic proteins corresponding to a Therapeutic protein portion of an albumin 
fusion protein of the invention, such as cell surface and secretory proteins, are often modified 
by the attachment of one or more oligosaccharide groups. The modification, referred to as 
glycosylation, can dramatically affect the physical properties of proteins and can be important 
in protein stability, secretion, and localization. Glycosylation occurs at specific locations 
along the polypeptide backbone. There are usually two major types of glycosylation: 
glycosylation characterized by O-linked oligosaccharides, which are attached to serine or 
threonine residues; and glycosylation characterized by N-linked oligosaccharides, which are 
attached to asparagine residues in an Asn-X-Ser/Thr sequence, where X can be any amino 
acid except proline. N-acetylneuraminic acid (also known as sialic acid) is usually the terminal 
residue of both N-linked and O-linked oligosaccharides. Variables such as protein structure 
and cell type influence the number and nature of the carbohydrate units within the chains at 
different glycosylation sites. Glycosylation isomers are also common at the same site within a 
given cell type. 
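
The Asn-X-Ser/Thr rule stated above can be illustrated by a short scan of a one-letter amino 
acid sequence. The following is a sketch under stated assumptions: the function name 
`find_n_glycosylation_sequons` and the toy peptide are hypothetical, and a matching sequence 
motif indicates only a potential N-linked site, not that glycosylation actually occurs there. 

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping motifs (e.g. N-N-T-S) each be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glycosylation_sequons(protein: str) -> list:
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr motifs (X not Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Toy peptide (hypothetical): the Asn at position 6 is followed by Pro and is skipped.
toy_peptide = "MNGSANPTQNNTS"
sites = find_n_glycosylation_sequons(toy_peptide)
print(sites)
```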

For example, several types of human interferon are glycosylated. Natural human 
interferon-α2 is O-glycosylated at threonine 106, and N-glycosylation occurs at asparagine 72 
in interferon-α14 (Adolf et al., J. Biochem 276:511 (1991); Nyman TA et al., J. Biochem 
329:295 (1998)). The oligosaccharides at asparagine 80 in natural interferon-β1a may be 
an important factor in the solubility and stability of the protein, but may not be essential for its 
biological activity. This permits the production of an unglycosylated analog (interferon-β1b) 
engineered with sequence modifications to enhance stability (Hosoi et al., J. Interferon Res. 
8:375 (1988); Karpusas et al., Cell Mol Life Sci 54:1203 (1998); Knight, J. Interferon Res. 
2:421 (1982); Runkel et al., Pharm Res 15:641 (1998); Lin, Dev. Biol. Stand. 96:97 
(1998)). Interferon-γ contains two N-linked oligosaccharide chains at positions 25 and 97, 
both important for the efficient formation of the bioactive recombinant protein, and having an 
influence on the pharmacokinetic properties of the protein (Sareneva et al., Eur. J. Biochem 
242:191 (1996); Sareneva et al., Biochem J. 303:831 (1994); Sareneva et al., J. Interferon 
Res. 13:267 (1993)). Mixed O-linked and N-linked glycosylation also occurs; for example, in 
human erythropoietin, N-linked glycosylation occurs at asparagine residues located at 
positions 24, 38 and 83 while O-linked glycosylation occurs at a serine residue located at 
position 126 (Lai et al., J. Biol. Chem. 261:3116 (1986); Broudy et al., Arch. Biochem. 
Biophys. 265:329 (1988)). 

Therapeutic proteins corresponding to a Therapeutic protein portion of an albumin 
fusion protein of the invention, as well as analogs and variants thereof, may be modified so 
that glycosylation at one or more sites is altered as a result of manipulation(s) of their nucleic 
acid sequence, by the host cell in which they are expressed, or due to other conditions of their 
expression. For example, glycosylation isomers may be produced by abolishing or 
introducing glycosylation sites, e.g., by substitution or deletion of amino acid residues, such 
as substitution of glutamine for asparagine, or unglycosylated recombinant proteins may be 
produced by expressing the proteins in host cells that will not glycosylate them, e.g., in E. coli 
or glycosylation-deficient yeast. These approaches are described in more detail below and are 
known in the art. 
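
The substitution of glutamine for asparagine mentioned above can be sketched at the sequence 
level as follows. This is an illustrative fragment only: the function name 
`abolish_n_glycosylation_sites` and the toy peptide are assumptions, only the 
Asn-X-Ser/Thr motif (X not Pro) is considered, and no claim is made about the effect of such 
substitutions on the activity of any particular protein. 

```python
import re

# Asn in an Asn-X-Ser/Thr motif, where X is any residue except Pro.
SEQUON = re.compile(r"N(?=[^P][ST])")


def abolish_n_glycosylation_sites(protein: str) -> str:
    """Replace the Asn of every Asn-X-Ser/Thr motif with Gln (N -> Q)."""
    return SEQUON.sub("Q", protein.upper())


# Toy peptide (hypothetical); the Asn followed by Pro is not in a motif and is left alone.
toy_peptide = "MNGSANPTQNNTS"
mutant = abolish_n_glycosylation_sites(toy_peptide)
print(mutant)
```

The sequence length is unchanged, and a rescan of the result finds no remaining motif. 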

Therapeutic proteins corresponding to a Therapeutic protein portion of an albumin 
fusion protein of the invention include, but are not limited to, plasma proteins. More 
specifically, such Therapeutic proteins include, but are not limited to, immunoglobulins, 
serum cholinesterase, alpha-1 antitrypsin, aprotinin, coagulation factors in both pre and active 
forms including but not limited to, von Willebrand factor, fibrinogen, factor II, factor VII, 
factor VIIA activated factor, factor VIII, factor IX, factor X, factor XIII, C1 inactivator, 
antithrombin III, thrombin, prothrombin, apo-lipoprotein, c-reactive protein, and protein C. 
Therapeutic proteins corresponding to a Therapeutic protein portion of an albumin fusion 
protein of the invention further include, but are not limited to, human growth hormone 
(hGH), α-interferon, erythropoietin (EPO), granulocyte-colony stimulating factor (GCSF), 
granulocyte-macrophage colony-stimulating factor (GMCSF), insulin, single chain 
antibodies, autocrine motility factor, scatter factor, laminin, hirudin, applaggin, monocyte 
chemotactic protein (MCP/MCAF), macrophage colony-stimulating factor (M-CSF), 
osteopontin, platelet factor 4, tenascin, vitronectin, in addition to those described in Table 1. 
These proteins and nucleic acid sequences encoding these proteins are well known and 
available in public databases such as Chemical Abstracts Services Databases (e.g., the CAS 
Registry), GenBank, and GenSeq as shown in Table 1. 

Additional Therapeutic proteins corresponding to a Therapeutic protein portion of an 
albumin fusion protein of the invention include, but are not limited to, one or more of the 
Therapeutic proteins or peptides disclosed in the "Therapeutic Protein X" column of Table 1, 
or fragment or variant thereof. 

Table 1 provides a non-exhaustive list of Therapeutic proteins that correspond to a 
Therapeutic protein portion of an albumin fusion protein of the invention. The "Therapeutic 
Protein X" column discloses Therapeutic protein molecules followed by parentheses 
containing scientific and brand names that comprise, or alternatively consist of, that 
Therapeutic protein molecule or a fragment or variant thereof. "Therapeutic protein X" as 
used herein may refer either to an individual Therapeutic protein molecule (as defined by the 
amino acid sequence obtainable from the CAS and Genbank accession numbers), or to the 
entire group of Therapeutic proteins associated with a given Therapeutic protein molecule 
disclosed in this column. The "Exemplary Identifier" column provides Chemical Abstracts 
Services (CAS) Registry Numbers (published by the American Chemical Society) and/or 
Genbank Accession Numbers (e.g., Locus ID, NP_XXXXX (Reference Sequence Protein), 
and XP_XXXXX (Model Protein) identifiers available through the National Center for 
Biotechnology Information (NCBI) webpage at www.ncbi.nlm.nih.gov) that correspond to 
entries in the CAS Registry or Genbank database which contain an amino acid sequence of the 
Therapeutic Protein Molecule or of a fragment or variant of the Therapeutic Protein Molecule. 
The summary pages associated with each of these CAS and Genbank Accession Numbers are 
each incorporated by reference in their entireties, particularly with respect to the amino acid 
sequences described therein. The "PCT/Patent Reference" column provides U.S. Patent 
numbers, or PCT International Publication Numbers corresponding to patents and/or 
published patent applications that describe the Therapeutic protein molecule. Each of the 
patents and/or published patent applications cited in the "PCT/Patent Reference" column is 
herein incorporated by reference in its entirety. In particular, the amino acid sequences of 
the specified polypeptide set forth in the sequence listing of each cited "PCT/Patent 
Reference", the variants of these amino acid sequences (mutations, fragments, etc.) set forth, 
for example, in the detailed description of each cited "PCT/Patent Reference", the therapeutic 
indications set forth, for example, in the detailed description of each cited "PCT/Patent 
Reference", and the activity assays for the specified polypeptide set forth in the detailed 
description, and more particularly, the examples of each cited "PCT/Patent Reference" are 
incorporated herein by reference. The "Biological Activity" column describes biological 
activities associated with the Therapeutic protein molecule. The "Exemplary Activity Assay" 
column provides references that describe assays which may be used to test the therapeutic 
and/or biological activity of a Therapeutic protein or an albumin fusion protein of the 
invention comprising a Therapeutic protein X portion. Each of the references cited in the 
"Exemplary Activity Assay" column is herein incorporated by reference in its entirety, 
particularly with respect to the description of the respective activity assay described in the 
reference (see Methods section, for example) for assaying the corresponding biological 
activity set forth in the "Biological Activity" column of Table 1. The "Preferred Indication Y" 
column describes diseases, disorders, and/or conditions that may be treated, prevented, 
diagnosed, or ameliorated by Therapeutic protein X or an albumin fusion protein of the 
invention comprising a "Therapeutic protein X" portion. 

In preferred embodiments, the albumin fusion proteins of the invention are capable of 
a therapeutic activity and/or biologic activity corresponding to the therapeutic activity and/or 
biologic activity of the Therapeutic protein corresponding to the Therapeutic protein portion of 
the albumin fusion protein listed in the corresponding row of Table 1. (See, e.g., the 
"Biological Activity" and "Therapeutic Protein X" columns of Table 1.) In further preferred 
embodiments, the therapeutically active protein portions of the albumin fusion proteins of the 
invention are fragments or variants of the reference sequence cited in the "Exemplary 
Identifier" column of Table 1, and are capable of the therapeutic activity and/or biologic 
activity of the corresponding Therapeutic protein disclosed in "Biological Activity" column of 
Table 1. 
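
The row-wise correspondence among the columns of Table 1 described above can be pictured as 
a simple record type. The following is a sketch only: the class name `Table1Row`, the helper 
`indications_for`, and the angle-bracket placeholder values are assumptions for illustration 
and do not reproduce any entry of Table 1. 

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Table1Row:
    """One row of Table 1; field names mirror the column headings named in the text."""
    therapeutic_protein_x: str
    exemplary_identifier: str
    pct_patent_reference: str
    biological_activity: str
    exemplary_activity_assay: str
    preferred_indication_y: str


def indications_for(rows: List[Table1Row], protein: str) -> List[str]:
    """Return the "Preferred Indication Y" entries found in the same row(s) as a protein."""
    return [r.preferred_indication_y for r in rows
            if r.therapeutic_protein_x == protein]


# Placeholder row: every value is a stand-in, not data from Table 1.
rows = [
    Table1Row(
        therapeutic_protein_x="<protein>",
        exemplary_identifier="<CAS or Genbank identifier>",
        pct_patent_reference="<patent reference>",
        biological_activity="<activity>",
        exemplary_activity_assay="<assay reference>",
        preferred_indication_y="<indication>",
    )
]
print(indications_for(rows, "<protein>"))
```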

Polypeptide and Polynucleotide Fragments and Variants 

Fragments 

The present invention is further directed to fragments of the Therapeutic proteins 
described in Table 1, albumin proteins, and/or albumin fusion proteins of the invention. 

Even if deletion of one or more amino acids from the N-terminus of a protein results 
in modification or loss of one or more biological functions of the Therapeutic protein, albumin 
protein, and/or albumin fusion protein, other Therapeutic activities and/or functional activities 
(e.g., biological activities, ability to multimerize, ability to bind a ligand) may still be retained. 
For example, the ability of polypeptides with N-terminal deletions to induce and/or bind to 
antibodies which recognize the complete or mature forms of the polypeptides generally will be 
retained when less than the majority of the residues of the complete polypeptide are removed 
from the N-terminus. Whether a particular polypeptide lacking N-terminal residues of a 
complete polypeptide retains such immunologic activities can readily be determined by routine 
methods described herein and otherwise known in the art. It is not unlikely that a mutein with 
a large number of deleted N-terminal amino acid residues may retain some biological or 
immunogenic activities. In fact, peptides composed of as few as six amino acid residues may 
often evoke an immune response. 

Accordingly, fragments of a Therapeutic protein corresponding to a Therapeutic 
protein portion of an albumin fusion protein of the invention include the full length protein as 
well as polypeptides having one or more residues deleted from the amino terminus of the 
amino acid sequence of the reference polypeptide (e.g., a Therapeutic protein as disclosed in 
Table 1). In particular, N-terminal deletions may be described by the general formula m-q, 
where q is a whole integer representing the total number of amino acid residues in a reference 
polypeptide (e.g., a Therapeutic protein referred to in Table 1), and m is defined as any 
integer ranging from 2 to q-6. Polynucleotides encoding these polypeptides are also 
encompassed by the invention. 

In addition, fragments of serum albumin polypeptides corresponding to an albumin 
protein portion of an albumin fusion protein of the invention include the full length protein as 
well as polypeptides having one or more residues deleted from the amino terminus of the 
amino acid sequence of the reference polypeptide (i.e., serum albumin). In particular, 
N-terminal deletions may be described by the general formula m-585, where 585 is a whole 
integer representing the total number of amino acid residues in serum albumin (SEQ ID 
NO: 18), and m is defined as any integer ranging from 2 to 579. Polynucleotides encoding 
these polypeptides are also encompassed by the invention. 
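
The general formula m-q for N-terminal deletions, and its m-585 instance for serum albumin, 
can be enumerated as in the following sketch. The function names and the ten-residue toy 
peptide are assumptions for illustration; the toy peptide is arbitrary and is not a sequence 
from this application. 

```python
def n_terminal_deletion_range(q: int) -> range:
    """Values of m for fragments m-q: every integer from 2 to q-6 inclusive."""
    if q < 8:
        raise ValueError("q must be at least 8 for the range 2 to q-6 to be non-empty")
    return range(2, q - 6 + 1)


def n_terminal_deletions(seq: str) -> dict:
    """Map each m to the fragment spanning residues m through q (1-based, inclusive)."""
    q = len(seq)
    return {m: seq[m - 1:] for m in n_terminal_deletion_range(q)}


# Mature serum albumin has q = 585 residues, so m runs from 2 to 585 - 6 = 579.
ha_range = n_terminal_deletion_range(585)
print(ha_range[0], ha_range[-1], len(ha_range))

# Arbitrary toy peptide with q = 10, giving fragments 2-10, 3-10 and 4-10.
toy_fragments = n_terminal_deletions("ACDEFGHIKL")
for m, fragment in toy_fragments.items():
    print(f"{m}-10: {fragment}")
```

Under this formula the shortest fragment, at m = q-6, retains the last seven residues of the 
reference polypeptide. 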

Moreover, fragments of albumin fusion proteins of the invention include the full 
length albumin fusion protein as well as polypeptides having one or more residues deleted 
from the amino terminus of the albumin fusion protein. In particular, N-terminal deletions 
may be described by the general formula m-q, where q is a whole integer representing the 
total number of amino acid residues in the albumin fusion protein, and m is defined as any 
integer ranging from 2 to q-6. Polynucleotides encoding these polypeptides are also 
encompassed by the invention. 

Also as mentioned above, even if deletion of one or more amino acids from the N-terminus
or C-terminus of a reference polypeptide (e.g., a Therapeutic protein and/or serum
albumin protein) results in modification or loss of one or more biological functions of the
protein, other functional activities (e.g., biological activities, ability to multimerize, ability to
bind a ligand) and/or Therapeutic activities may still be retained. For example, the ability of
polypeptides with C-terminal deletions to induce and/or bind to antibodies which recognize 
the complete or mature forms of the polypeptide generally will be retained when less than the 
majority of the residues of the complete or mature polypeptide are removed from the 
C-terminus. Whether a particular polypeptide lacking the N-terminal and/or C-terminal
residues of a reference polypeptide retains Therapeutic activity can readily be determined by 
routine methods described herein and/or otherwise known in the art. 

The present invention further provides polypeptides having one or more residues 
deleted from the carboxy terminus of the amino acid sequence of a Therapeutic protein 
corresponding to a Therapeutic protein portion of an albumin fusion protein of the invention 
(e.g., a Therapeutic protein referred to in Table 1). In particular, C-terminal deletions may be 
described by the general formula 1-n, where n is any whole integer ranging from 6 to q-1, 
and where q is a whole integer representing the total number of amino acid residues in a 
reference polypeptide (e.g., a Therapeutic protein referred to in Table 1). Polynucleotides 
encoding these polypeptides are also encompassed by the invention. 

In addition, the present invention provides polypeptides having one or more residues 
deleted from the carboxy terminus of the amino acid sequence of an albumin protein
corresponding to an albumin protein portion of an albumin fusion protein of the invention
(e.g., serum albumin). In particular, C-terminal deletions may be described by the general
formula 1-n, where n is any whole integer ranging from 6 to 584, where 584 is the whole
integer representing the total number of amino acid residues in serum albumin (SEQ ID
NO: 18) minus 1. Polynucleotides encoding these polypeptides are also encompassed by the
invention. 

Moreover, the present invention provides polypeptides having one or more residues 
deleted from the carboxy terminus of an albumin fusion protein of the invention. In
particular, C-terminal deletions may be described by the general formula 1-n, where n is any
whole integer ranging from 6 to q-1, and where q is a whole integer representing the total
number of amino acid residues in an albumin fusion protein of the invention. Polynucleotides 
encoding these polypeptides are also encompassed by the invention. 

In addition, any of the above described N- or C-terminal deletions can be combined to
produce an N- and C-terminal deleted reference polypeptide. The invention also provides
polypeptides having one or more amino acids deleted from both the amino and the carboxyl
termini, which may be described generally as having residues m-n of a reference polypeptide 
(e.g., a Therapeutic protein referred to in Table 1, or serum albumin (e.g., SEQ ID NO: 18), 
or an albumin fusion protein of the invention) where n and m are integers as described above. 
Polynucleotides encoding these polypeptides are also encompassed by the invention. 
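
By way of illustration only, the m-q, 1-n and m-n notation used above can be enumerated with a few lines of Python. This sketch is not part of the disclosure: the function names, the 12-residue example (the first 12 residues of the albumin sequence shown in the figures) and the `min_len` cutoff applied to combined deletions are assumptions made for the example.

```python
def n_terminal_deletions(seq):
    """Fragments m-q for m = 2 .. q-6 (1-based, inclusive positions)."""
    q = len(seq)
    return {(m, q): seq[m - 1:] for m in range(2, q - 6 + 1)}


def c_terminal_deletions(seq):
    """Fragments 1-n for n = 6 .. q-1 (1-based, inclusive positions)."""
    q = len(seq)
    return {(1, n): seq[:n] for n in range(6, q)}


def combined_deletions(seq, min_len=6):
    """Fragments m-n using the same ranges for m and n as above.

    min_len is an assumption of this sketch: it simply discards
    m/n pairs that would leave fewer than min_len residues.
    """
    q = len(seq)
    return {
        (m, n): seq[m - 1:n]
        for m in range(2, q - 6 + 1)
        for n in range(6, q)
        if n - m + 1 >= min_len
    }


# Hypothetical short reference polypeptide used only for illustration.
reference = "DAHKSEVAHRFK"
for (m, n), fragment in sorted(n_terminal_deletions(reference).items()):
    print(f"{m}-{n}: {fragment}")
```

For a real reference polypeptide the same functions apply unchanged; only the value of q (the sequence length) differs.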

The present application is also directed to proteins containing polypeptides at least
80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identical to a reference polypeptide
sequence (e.g., a Therapeutic protein, serum albumin protein or an albumin fusion protein of
the invention) set forth herein, or fragments thereof. In preferred embodiments, the
application is directed to proteins comprising polypeptides at least 80%, 85%, 90%, 95%,
96%, 97%, 98% or 99% identical to reference polypeptides having the amino acid sequence
of N- and C-terminal deletions as described above. Polynucleotides encoding these 
polypeptides are also encompassed by the invention. 

Preferred polypeptide fragments of the invention are fragments comprising, or
alternatively, consisting of, an amino acid sequence that displays a Therapeutic activity and/or
functional activity (e.g., biological activity) of the polypeptide sequence of the Therapeutic
protein or serum albumin protein of which the amino acid sequence is a fragment.

Other preferred polypeptide fragments are biologically active fragments. Biologically active
fragments are those exhibiting activity similar, but not necessarily identical, to an activity of
the polypeptide of the present invention. The biological activity of the fragments may include
an improved desired activity, or a decreased undesirable activity.

Variants

"Variant" refers to a polynucleotide or nucleic acid differing from a reference nucleic
acid or polypeptide, but retaining essential properties thereof. Generally, variants are overall 
closely similar, and, in many regions, identical to the reference nucleic acid or polypeptide. 

As used herein, "variant" refers to a Therapeutic protein portion of an albumin fusion
protein of the invention, albumin portion of an albumin fusion protein of the invention, or
albumin fusion protein differing in sequence from a Therapeutic protein (e.g., see
"therapeutic" column of Table 1), albumin protein, and/or albumin fusion protein of the
invention, respectively, but retaining at least one functional and/or therapeutic property
thereof (e.g., a therapeutic activity and/or biological activity as disclosed in the "Biological
Activity" column of Table 1) as described elsewhere herein or otherwise known in the art.
Generally, variants are overall very similar, and, in many regions, identical to the amino acid
sequence of the Therapeutic protein corresponding to a Therapeutic protein portion of an
albumin fusion protein of the invention, albumin protein corresponding to an albumin protein
portion of an albumin fusion protein of the invention, and/or albumin fusion protein of the
invention. Nucleic acids encoding these variants are also encompassed by the invention.

The present invention is also directed to proteins which comprise, or alternatively
consist of, an amino acid sequence which is at least 80%, 85%, 90%, 95%, 96%, 97%,
98%, 99% or 100% identical to, for example, the amino acid sequence of a Therapeutic
protein corresponding to a Therapeutic protein portion of an albumin fusion protein of the
invention (e.g., an amino acid sequence disclosed in the "Exemplary Identifier" column of
Table 1, or fragments or variants thereof), albumin proteins (e.g., SEQ ID NO: 18 or
fragments or variants thereof) corresponding to an albumin protein portion of an albumin
fusion protein of the invention, and/or albumin fusion proteins of the invention. Fragments
of these polypeptides are also provided (e.g., those fragments described herein). Further
polypeptides encompassed by the invention are polypeptides encoded by polynucleotides
which hybridize to the complement of a nucleic acid molecule encoding an amino acid
sequence of the invention under stringent hybridization conditions (e.g., hybridization to filter
bound DNA in 6X sodium chloride/sodium citrate (SSC) at about 45 degrees Celsius,
followed by one or more washes in 0.2X SSC, 0.1% SDS at about 50 - 65 degrees Celsius),
under highly stringent conditions (e.g., hybridization to filter bound DNA in 6X sodium
chloride/sodium citrate (SSC) at about 45 degrees Celsius, followed by one or more washes
in 0.1X SSC, 0.2% SDS at about 68 degrees Celsius), or under other stringent hybridization
conditions which are known to those of skill in the art (see, for example, Ausubel, F.M. et
al., eds., 1989, Current Protocols in Molecular Biology, Green Publishing Associates, Inc., and
John Wiley & Sons Inc., New York, at pages 6.3.1 - 6.3.6 and 2.10.3). Polynucleotides
encoding these polypeptides are also encompassed by the invention. 

By a polypeptide having an amino acid sequence at least, for example, 95% "identical"
to a query amino acid sequence of the present invention, it is intended that the amino acid
sequence of the subject polypeptide is identical to the query sequence except that the subject 
polypeptide sequence may include up to five amino acid alterations per each 100 amino acids 
of the query amino acid sequence. In other words, to obtain a polypeptide having an amino 
acid sequence at least 95% identical to a query amino acid sequence, up to 5% of the amino
acid residues in the subject sequence may be inserted, deleted, or substituted with another 
amino acid. These alterations of the reference sequence may occur at the amino- or carboxy- 
terminal positions of the reference amino acid sequence or anywhere between those terminal 
positions, interspersed either individually among residues in the reference sequence or in one
or more contiguous groups within the reference sequence.
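
As a small worked illustration of the definition above, the number of alterations compatible with a given identity threshold can be computed directly. This is only a sketch: the function name is hypothetical, and rounding down to a whole residue is an assumption of the example rather than something stated in the text.

```python
def max_alterations(query_length: int, percent_identity: int) -> int:
    """Largest whole number of inserted, deleted or substituted residues
    that still leaves a subject sequence at least `percent_identity`
    percent identical to a query of `query_length` residues.

    Integer arithmetic is used so that the result is rounded down
    (an assumption of this sketch).
    """
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must be between 0 and 100")
    if query_length < 0:
        raise ValueError("query_length must be non-negative")
    return query_length * (100 - percent_identity) // 100


# 95% identity over a 100-residue query allows up to five alterations.
print(max_alterations(100, 95))
```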

As a practical matter, whether any particular polypeptide is at least 80%, 85%, 90%,
95%, 96%, 97%, 98% or 99% identical to, for instance, the amino acid sequence of an
albumin fusion protein of the invention or a fragment thereof (such as the Therapeutic protein
portion of the albumin fusion protein or the albumin portion of the albumin fusion protein)
can be determined conventionally using known computer programs. A preferred method for
determining the best overall match between a query sequence (a sequence of the present
invention) and a subject sequence, also referred to as a global sequence alignment, can be
determined using the FASTDB computer program based on the algorithm of Brutlag et al.
(Comp. App. Biosci. 6:237-245 (1990)). In a sequence alignment the query and subject
sequences are either both nucleotide sequences or both amino acid sequences. The result of
said global sequence alignment is expressed as percent identity. Preferred parameters used in
a FASTDB amino acid alignment are: Matrix=PAM 0, k-tuple=2, Mismatch Penalty=1,
Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1, Window
Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the
length of the subject amino acid sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence due to N- or C-terminal
deletions, not because of internal deletions, a manual correction must be made to the results.
This is because the FASTDB program does not account for N- and C-terminal truncations of
the subject sequence when calculating global percent identity. For subject sequences
truncated at the N- and C-termini, relative to the query sequence, the percent identity is
corrected by calculating the number of residues of the query sequence that are N- and
C-terminal of the subject sequence, which are not matched/aligned with a corresponding subject
residue, as a percent of the total residues of the query sequence. Whether a residue is
matched/aligned is determined by results of the FASTDB sequence alignment. This
percentage is then subtracted from the percent identity, calculated by the above FASTDB
program using the specified parameters, to arrive at a final percent identity score. This final
percent identity score is what is used for the purposes of the present invention. Only residues
to the N- and C-termini of the subject sequence, which are not matched/aligned with the query
sequence, are considered for the purposes of manually adjusting the percent identity score.
That is, only query residue positions outside the farthest N- and C-terminal residues of the
subject sequence are considered.

For example, a 90 amino acid residue subject sequence is aligned with a 100 residue
query sequence to determine percent identity. The deletion occurs at the N-terminus of the
subject sequence and therefore, the FASTDB alignment does not show a matching/alignment
of the first 10 residues at the N-terminus. The 10 unpaired residues represent 10% of the
sequence (number of residues at the N- and C-termini not matched/total number of residues
in the query sequence) so 10% is subtracted from the percent identity score calculated by the
FASTDB program. If the remaining 90 residues were perfectly matched the final percent
identity would be 90%. In another example, a 90 residue subject sequence is compared with
a 100 residue query sequence. This time the deletions are internal deletions so there are no
residues at the N- or C-termini of the subject sequence which are not matched/aligned with the
query. In this case the percent identity calculated by FASTDB is not manually corrected.
Once again, only residue positions outside the N- and C-terminal ends of the subject
sequence, as displayed in the FASTDB alignment, which are not matched/aligned with the
query sequence are manually corrected for. No other manual corrections are to be made for the
purposes of the present invention.
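
The manual correction described above amounts to a single subtraction, sketched below in Python for illustration. The function and parameter names are hypothetical; the sketch assumes the FASTDB percent identity and the counts of unmatched terminal query residues have already been read off the alignment, and it does not call FASTDB itself.

```python
def corrected_percent_identity(fastdb_identity: float,
                               query_length: int,
                               unmatched_n_term: int,
                               unmatched_c_term: int) -> float:
    """Apply the terminal-truncation correction to a FASTDB score.

    fastdb_identity   percent identity reported by the alignment program
    query_length      total number of residues in the query sequence
    unmatched_n_term  query residues N-terminal of the subject that are
                      not matched/aligned with a subject residue
    unmatched_c_term  the same count at the C-terminus
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if unmatched_n_term < 0 or unmatched_c_term < 0:
        raise ValueError("unmatched residue counts must be non-negative")
    # Unmatched terminal residues expressed as a percent of the query.
    penalty = 100.0 * (unmatched_n_term + unmatched_c_term) / query_length
    return fastdb_identity - penalty


# First example from the text: 90-residue subject, 100-residue query,
# ten unmatched N-terminal residues, remaining residues perfectly matched.
print(corrected_percent_identity(100.0, 100, 10, 0))

# Second example: internal deletions only, so nothing is subtracted.
# The score of 87.5 is a made-up placeholder, not a value from the text.
print(corrected_percent_identity(87.5, 100, 0, 0))
```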

The variant will usually have at least 75% (preferably at least about 80%, 90%, 95%
or 99%) sequence identity with a length of normal HA or Therapeutic protein which is the
same length as the variant. Homology or identity at the nucleotide or amino acid sequence
level is determined by BLAST (Basic Local Alignment Search Tool) analysis using the
algorithm employed by the programs blastp, blastn, blastx, tblastn and tblastx (Karlin et al.,
Proc. Natl. Acad. Sci. USA 87: 2264-2268 (1990) and Altschul, J. Mol. Evol. 36: 290-300
(1993), fully incorporated by reference) which are tailored for sequence similarity searching.

The approach used by the BLAST program is to first consider similar segments
between a query sequence and a database sequence, then to evaluate the statistical significance
of all matches that are identified and finally to summarize only those matches which satisfy a
preselected threshold of significance. For a discussion of basic issues in similarity searching
of sequence databases, see Altschul et al. (Nature Genetics 6: 119-129 (1994)) which is fully
incorporated by reference. The search parameters for histogram, descriptions, alignments,
expect (i.e., the statistical significance threshold for reporting matches against database
sequences), cutoff, matrix and filter are at the default settings. The default scoring matrix used
by blastp, blastx, tblastn, and tblastx is the BLOSUM62 matrix (Henikoff et al., Proc. Natl.
Acad. Sci. USA 89: 10915-10919 (1992), fully incorporated by reference). For blastn, the
scoring matrix is set by the ratios of M (i.e., the reward score for a pair of matching residues)
to N (i.e., the penalty score for mismatching residues), wherein the default values for M and
N are 5 and -4, respectively. Four blastn parameters may be adjusted as follows: Q=10 (gap
creation penalty); R=10 (gap extension penalty); wink=1 (generates word hits at every wink-th
position along the query); and gapw=16 (sets the window width within which gapped
alignments are generated). The equivalent Blastp parameter settings were Q=9; R=2; wink=1;
and gapw=32. A Bestfit comparison between sequences, available in the GCG package
version 10.0, uses DNA parameters GAP=50 (gap creation penalty) and LEN=3 (gap
extension penalty) and the equivalent settings in protein comparisons are GAP=8 and LEN=2.

The polynucleotide variants of the invention may contain alterations in the coding
regions, non-coding regions, or both. Especially preferred are polynucleotide variants
containing alterations which produce silent substitutions, additions, or deletions, but do not
alter the properties or activities of the encoded polypeptide. Nucleotide variants produced by
silent substitutions due to the degeneracy of the genetic code are preferred. Moreover,
polypeptide variants in which less than 50, less than 40, less than 30, less than 20, less than
10, or 5-50, 5-25, 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any
combination are also preferred. Polynucleotide variants can be produced for a variety of
reasons, e.g., to optimize codon expression for a particular host (change codons in the human
mRNA to those preferred by a host such as yeast or E. coli).

In a preferred embodiment, a polynucleotide encoding an albumin portion of an
albumin fusion protein of the invention is optimized for expression in yeast or mammalian
cells. In a further preferred embodiment, a polynucleotide encoding a Therapeutic protein
portion of an albumin fusion protein of the invention is optimized for expression in yeast or
mammalian cells. In a still further preferred embodiment, a polynucleotide encoding an
albumin fusion protein of the invention is optimized for expression in yeast or mammalian
cells.
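
Codon optimization by silent substitution can be pictured with the following Python sketch, offered for illustration only. It recodes a coding sequence codon by codon while leaving the encoded polypeptide unchanged. The `HYPOTHETICAL_PREFERRED` table is an illustrative choice of one codon per amino acid and is not a validated codon usage table for yeast or any other host; the short input sequence is likewise made up for the example.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Illustrative only: one arbitrarily chosen codon per amino acid.
HYPOTHETICAL_PREFERRED = {
    "A": "GCT", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "TTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCA",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAC", "V": "GTT",
    "*": "TAA",
}


def translate(dna: str) -> str:
    """Translate a coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred=HYPOTHETICAL_PREFERRED) -> str:
    """Replace every codon with the table's codon for the same amino acid."""
    return "".join(preferred[aa] for aa in translate(dna))


# Made-up coding sequence for the residues DAHKSEV plus a double stop codon.
original = "GATGCACACAAGAGTGAGGTTTAATAA"
optimized = recode(original)
print(optimized)
print(translate(original) == translate(optimized))
```

Because only synonymous codons are exchanged, the translated sequence is identical before and after recoding; a practical tool would also weigh factors such as restriction sites and mRNA structure, which this sketch ignores.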

In an alternative embodiment, a codon optimized polynucleotide encoding a
Therapeutic protein portion of an albumin fusion protein of the invention does not hybridize to
the wild type polynucleotide encoding the Therapeutic protein under stringent hybridization
conditions as described herein. In a further embodiment, a codon optimized polynucleotide
encoding an albumin portion of an albumin fusion protein of the invention does not hybridize
to the wild type polynucleotide encoding the albumin protein under stringent hybridization
conditions as described herein. In another embodiment, a codon optimized polynucleotide
encoding an albumin fusion protein of the invention does not hybridize to the wild type
polynucleotide encoding the Therapeutic protein portion or the albumin protein portion under
stringent hybridization conditions as described herein.

In an additional embodiment, polynucleotides encoding a Therapeutic protein portion
of an albumin fusion protein of the invention do not comprise, or alternatively consist of, the
naturally occurring sequence of that Therapeutic protein. In a further embodiment,
polynucleotides encoding an albumin protein portion of an albumin fusion protein of the 
invention do not comprise, or alternatively consist of, the naturally occurring sequence of 
albumin protein. In an alternative embodiment, polynucleotides encoding an albumin fusion 
protein of the invention do not comprise, or alternatively consist of, the naturally occurring 
sequence of a Therapeutic protein portion or the albumin protein portion. 

Naturally occurring variants are called "allelic variants," and refer to one of several
alternate forms of a gene occupying a given locus on a chromosome of an organism. (Genes
II, Lewin, B., ed., John Wiley & Sons, New York (1985)). These allelic variants can vary at
either the polynucleotide and/or polypeptide level and are included in the present invention. 
Alternatively, non-naturally occurring variants may be produced by mutagenesis techniques or 
by direct synthesis. 

Using known methods of protein engineering and recombinant DNA technology, 
variants may be generated to improve or alter the characteristics of the polypeptides of the 
present invention. For instance, one or more amino acids can be deleted from the N-terminus 
or C-terminus of the polypeptide of the present invention without substantial loss of biological 
function. As an example, Ron et al. (J. Biol. Chem. 268: 2984-2988 (1993)) reported variant 
KGF proteins having heparin binding activity even after deleting 3, 8, or 27 amino-terminal 
amino acid residues. Similarly, Interferon gamma exhibited up to ten times higher activity 
after deleting 8-10 amino acid residues from the carboxy terminus of this protein. (Dobeli et 
al., J. Biotechnology 7:199-216 (1988).) 

Moreover, ample evidence demonstrates that variants often retain a biological activity 
similar to that of the naturally occurring protein. For example, Gayle and coworkers (J. Biol. 
Chem. 268:22105-22111 (1993)) conducted extensive mutational analysis of human cytokine 
IL-1a. They used random mutagenesis to generate over 3,500 individual IL-1a mutants that
averaged 2.5 amino acid changes per variant over the entire length of the molecule. Multiple
mutations were examined at every possible amino acid position. The investigators found that
"[m]ost of the molecule could be altered with little effect on either [binding or biological
activity]." In fact, only 23 unique amino acid sequences, out of more than 3,500 nucleotide
sequences examined, produced a protein that significantly differed in activity from wild-type.

Furthermore, even if deleting one or more amino acids from the N-terminus or C- 
terminus of a polypeptide results in modification or loss of one or more biological functions, 
other biological activities may still be retained. For example, the ability of a deletion variant to 
induce and/or to bind antibodies which recognize the secreted form will likely be retained 
when less than the majority of the residues of the secreted form are removed from the N- 
terminus or C-terminus. Whether a particular polypeptide lacking N- or C-terminal residues 
of a protein retains such immunogenic activities can readily be determined by routine methods
described herein and otherwise known in the art.

Thus, the invention further includes polypeptide variants which have a functional
activity (e.g., biological activity and/or therapeutic activity). In highly preferred embodiments
the invention provides variants of albumin fusion proteins that have a functional activity (e.g.,
biological activity and/or therapeutic activity, such as that disclosed in the "Biological
Activity" column in Table 1) that corresponds to one or more biological and/or therapeutic
activities of the Therapeutic protein corresponding to the Therapeutic protein portion of the
albumin fusion protein. Such variants include deletions, insertions, inversions, repeats, and
substitutions selected according to general rules known in the art so as to have little effect on
activity.

In preferred embodiments, the variants of the invention have conservative
substitutions. By "conservative substitutions" is intended swaps within groups such as
replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and Ile; replacement of
the hydroxyl residues Ser and Thr; replacement of the acidic residues Asp and Glu;
replacement of the amide residues Asn and Gln; replacement of the basic residues Lys, Arg,
and His; replacement of the aromatic residues Phe, Tyr, and Trp; and replacement of the
small-sized amino acids Ala, Ser, Thr, Met, and Gly.
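
The groups listed above translate directly into a lookup, sketched here in Python for illustration. The group labels and the function name are hypothetical; the membership of each group is taken from the preceding paragraph, written in one-letter code.

```python
# Substitution groups from the text, in one-letter amino acid code.
CONSERVATIVE_GROUPS = {
    "aliphatic/hydrophobic": set("AVLI"),  # Ala, Val, Leu, Ile
    "hydroxyl": set("ST"),                 # Ser, Thr
    "acidic": set("DE"),                   # Asp, Glu
    "amide": set("NQ"),                    # Asn, Gln
    "basic": set("KRH"),                   # Lys, Arg, His
    "aromatic": set("FYW"),                # Phe, Tyr, Trp
    "small": set("ASTMG"),                 # Ala, Ser, Thr, Met, Gly
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues share at least one group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # no substitution took place
    return any(a in group and b in group
               for group in CONSERVATIVE_GROUPS.values())


print(is_conservative("L", "I"))  # both aliphatic/hydrophobic
print(is_conservative("K", "D"))  # basic versus acidic
```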

Guidance concerning how to make phenotypically silent amino acid substitutions is
provided, for example, in Bowie et al., "Deciphering the Message in Protein Sequences:
Tolerance to Amino Acid Substitutions," Science 247: 1306-1310 (1990), wherein the authors
indicate that there are two main strategies for studying the tolerance of an amino acid sequence
to change.

The first strategy exploits the tolerance of amino acid substitutions by natural selection
during the process of evolution. By comparing amino acid sequences in different species,
conserved amino acids can be identified. These conserved amino acids are likely important
for protein function. In contrast, the amino acid positions where substitutions have been
tolerated by natural selection indicate that these positions are not critical for protein function.
Thus, positions tolerating amino acid substitution could be modified while still maintaining
biological activity of the protein.
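
The comparison underlying this first strategy can be sketched as a column-by-column scan of aligned sequences, shown below in Python for illustration. The function name is hypothetical, the sequences are assumed to be pre-aligned to equal length with "-" marking gaps, and the three toy sequences are invented for the example rather than taken from real species.

```python
def conserved_positions(aligned_sequences):
    """Return 1-based alignment positions where every sequence carries
    the same residue (gap-only columns are not counted)."""
    if not aligned_sequences:
        return []
    if len({len(seq) for seq in aligned_sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    conserved = []
    for index, column in enumerate(zip(*aligned_sequences), start=1):
        if len(set(column)) == 1 and column[0] != "-":
            conserved.append(index)
    return conserved


# Invented toy alignment, not real orthologous sequences.
toy_alignment = ["DAHKSEV",
                 "DTHKSEI",
                 "EAHKSDV"]
print(conserved_positions(toy_alignment))
```

Positions absent from the returned list are the ones at which substitutions appear to have been tolerated in this toy alignment.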

The second strategy uses genetic engineering to introduce amino acid changes at
specific positions of a cloned gene to identify regions critical for protein function. For
example, site directed mutagenesis or alanine-scanning mutagenesis (introduction of single
alanine mutations at every residue in the molecule) can be used. See Cunningham and Wells,
Science 244:1081-1085 (1989). The resulting mutant molecules can then be tested for
biological activity.
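
For illustration, the set of single-alanine mutants used in an alanine scan can be listed as follows. This Python sketch only generates the mutant sequences; the function name and the short example sequence are hypothetical, and skipping positions that are already alanine is an assumption of the example.

```python
def alanine_scan(sequence: str) -> dict:
    """Map mutation labels such as 'H3A' to the mutated sequence.

    Positions are 1-based. Residues that are already alanine are
    skipped (an assumption of this sketch).
    """
    sequence = sequence.upper()
    mutants = {}
    for position, residue in enumerate(sequence, start=1):
        if residue == "A":
            continue
        label = f"{residue}{position}A"
        mutants[label] = sequence[:position - 1] + "A" + sequence[position:]
    return mutants


# Hypothetical seven-residue example.
for label, mutant in alanine_scan("DAHKSEV").items():
    print(label, mutant)
```

Each mutant would then be expressed and assayed separately, as described above.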

As the authors state, these two strategies have revealed that proteins are surprisingly
tolerant of amino acid substitutions. The authors further indicate which amino acid changes
are likely to be permissive at certain amino acid positions in the protein. For example, most
buried (within the tertiary structure of the protein) amino acid residues require nonpolar side
chains, whereas few features of surface side chains are generally conserved. Moreover,
tolerated conservative amino acid substitutions involve replacement of the aliphatic or
hydrophobic amino acids Ala, Val, Leu and Ile; replacement of the hydroxyl residues Ser and
Thr; replacement of the acidic residues Asp and Glu; replacement of the amide residues Asn
and Gln; replacement of the basic residues Lys, Arg, and His; replacement of the aromatic
residues Phe, Tyr, and Trp; and replacement of the small-sized amino acids Ala, Ser, Thr,
Met, and Gly. Besides conservative amino acid substitution, variants of the present invention
include (i) polypeptides containing substitutions of one or more of the non-conserved amino
acid residues, where the substituted amino acid residues may or may not be one encoded by
the genetic code, or (ii) polypeptides containing substitutions of one or more of the amino acid
residues having a substituent group, or (iii) polypeptides which have been fused with or
chemically conjugated to another compound, such as a compound to increase the stability
and/or solubility of the polypeptide (for example, polyethylene glycol), or (iv) polypeptides
containing additional amino acids, such as, for example, an IgG Fc fusion region peptide.
Such variant polypeptides are deemed to be within the scope of those skilled in the art from
the teachings herein.

For example, polypeptide variants containing amino acid substitutions of charged
amino acids with other charged or neutral amino acids may produce proteins with improved
characteristics, such as less aggregation. Aggregation of pharmaceutical formulations both
reduces activity and increases clearance due to the aggregate's immunogenic activity. See
Pinckard et al., Clin. Exp. Immunol. 2:331-340 (1967); Robbins et al., Diabetes 36: 838-845
(1987); Cleland et al., Crit. Rev. Therapeutic Drug Carrier Systems 10:307-377 (1993).

In specific embodiments, the polypeptides of the invention comprise, or alternatively
consist of, fragments or variants of the amino acid sequence of a Therapeutic protein
described herein and/or human serum albumin, and/or albumin fusion protein of the
invention, wherein the fragments or variants have 1-5, 5-10, 5-25, 5-50, 10-50 or 50-150
amino acid residue additions, substitutions, and/or deletions when compared to the reference
amino acid sequence. In preferred embodiments, the amino acid substitutions are
conservative. Nucleic acids encoding these polypeptides are also encompassed by the
invention.

The polypeptide of the present invention can be composed of amino acids joined to 
each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and may 
contain amino acids other than the 20 gene-encoded amino acids. The polypeptides may be
modified by either natural processes, such as post-translational processing, or by chemical
modification techniques which are well known in the art. Such modifications are well
described in basic texts and in more detailed monographs, as well as in a voluminous research
literature. Modifications can occur anywhere in a polypeptide, including the peptide 
backbone, the amino acid side-chains and the amino or carboxyl termini. It will be 
appreciated that the same type of modification may be present in the same or varying degrees 
at several sites in a given polypeptide. Also, a given polypeptide may contain many types of 
modifications. Polypeptides may be branched, for example, as a result of ubiquitination, and 
they may be cyclic, with or without branching. Cyclic, branched, and branched cyclic
polypeptides may result from posttranslation natural processes or may be made by synthetic 
methods. Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent 
attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a 
nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent 
attachment of phosphotidylinositol, cross-linking, cyclization, disulfide bond formation,
demethylation, formation of covalent cross-links, formation of cysteine, formation of 
pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation, 
hydroxylation, iodination, methylation, myristylation, oxidation, pegylation, proteolytic 
processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer- 
RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination. 
(See, for instance, PROTEINS - STRUCTURE AND MOLECULAR PROPERTIES, 2nd 
Ed., T. E. Creighton, W. H. Freeman and Company, New York (1993); POST- 
TRANSLATIONAL COVALENT MODIFICATION OF PROTEINS, B. C. Johnson, Ed., 
Academic Press, New York, pgs. 1-12 (1983); Seifter et al., Meth. Enzymol. 182:626-646 
(1990); Rattan et al., Ann. N.Y. Acad. Sci. 663:48-62 (1992)). 

Functional activity 

"A polypeptide having functional activity" refers to a polypeptide capable of 
displaying one or more known functional activities associated with the full-length, pro- 
protein, and/or mature form of a Therapeutic protein. Such functional activities include, but 
are not limited to, biological activity, antigenicity [ability to bind (or compete with a
polypeptide for binding) to an anti-polypeptide antibody], immunogenicity (ability to generate
antibody which binds to a specific polypeptide of the invention), ability to form multimers 
with polypeptides of the invention, and ability to bind to a receptor or ligand for a 
polypeptide. 

"A polypeptide having biological activity" refers to a polypeptide exhibiting activity 
similar to, but not necessarily identical to, an activity of a Therapeutic protein of the present 
invention, including mature forms, as measured in a particular biological assay, with or 
without dose dependency. In the case where dose dependency does exist, it need not be 
identical to that of the polypeptide, but rather substantially similar to the dose-dependence in a
given activity as compared to the polypeptide of the present invention (i.e., the candidate
polypeptide will exhibit greater activity or not more than about 25-fold less and, preferably, 
not more than about tenfold less activity, and most preferably, not more than about three-fold 
less activity relative to the polypeptide of the present invention). 
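
The fold-difference thresholds given above can be expressed as a simple comparison, sketched here in Python for illustration. The function name and the tier labels are hypothetical, the activities are assumed to be positive measurements from the same assay, and the cutoffs are treated as exact even though the text qualifies each of them with "about".

```python
def activity_tier(candidate_activity: float, reference_activity: float) -> str:
    """Classify a candidate by how many fold less active it is than the
    reference polypeptide in the same assay."""
    if candidate_activity <= 0 or reference_activity <= 0:
        raise ValueError("activities must be positive")
    # A value below 1 means the candidate is more active than the reference.
    fold_less = reference_activity / candidate_activity
    if fold_less <= 3:
        return "most preferred"   # greater activity, or up to three-fold less
    if fold_less <= 10:
        return "preferred"        # up to tenfold less
    if fold_less <= 25:
        return "within range"     # up to 25-fold less
    return "outside range"


# Made-up assay readings in arbitrary units.
print(activity_tier(50.0, 100.0))
print(activity_tier(1.0, 100.0))
```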

In preferred embodiments, an albumin fusion protein of the invention has at least one 
biological and/or therapeutic activity associated with the Therapeutic protein (or fragment or 
variant thereof) when it is not fused to albumin. 

The albumin fusion proteins of the invention can be assayed for functional activity 
(e.g., biological activity) using or routinely modifying assays known in the art, as well as 
assays described herein. Specifically, albumin fusion proteins may be assayed for functional 
activity (e.g., biological activity or therapeutic activity) using the assay referenced in the 
"Exemplary Activity Assay" column of Table 1. Additionally, one of skill in the art may 
routinely assay fragments of a Therapeutic protein corresponding to a Therapeutic protein 
portion of an albumin fusion protein of the invention, for activity using assays referenced in 
its corresponding row of Table 1. Further, one of skill in the art may routinely assay 
fragments of an albumin protein corresponding to an albumin protein portion of an albumin 
fusion protein of the invention, for activity using assays known in the art and/or as described 
in the Examples section below.

For example, in one embodiment where one is assaying for the ability of an albumin 
fusion protein of the invention to bind or compete with a Therapeutic protein for binding to 
an anti-Therapeutic polypeptide antibody and/or anti-albumin antibody, various 
immunoassays known in the art can be used, including but not limited to, competitive and 
non-competitive assay systems using techniques such as radioimmunoassays, ELISA 
(enzyme linked immunosorbent assay), "sandwich" immunoassays, immunoradiometric 
assays, gel diffusion precipitation reactions, immunodiffusion assays, in situ immunoassays 
(using colloidal gold, enzyme or radioisotope labels, for example), western blots, 
precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination 
assays), complement fixation assays, immunofluorescence assays, protein A assays, and 
immunoelectrophoresis assays, etc. In one embodiment, antibody binding is detected by
detecting a label on the primary antibody. In another embodiment, the primary antibody is 
detected by detecting binding of a secondary antibody or reagent to the primary antibody. In a 
further embodiment, the secondary antibody is labeled. Many means are known in the art for 
detecting binding in an immunoassay and are within the scope of the present invention. 

In a preferred embodiment, where a binding partner (e.g., a receptor or a ligand) of a 
Therapeutic protein is identified, binding to that binding partner by an albumin fusion protein 
containing that Therapeutic protein as the Therapeutic protein portion of the fusion can be 
assayed, e.g., by means well-known in the art, such as, for example, reducing and
non-reducing gel chromatography, protein affinity chromatography, and affinity blotting. See
generally, Phizicky et al., Microbiol. Rev. 59:94-123 (1995). In another embodiment, the 
ability of physiological correlates of an albumin fusion protein of the present invention to 
bind to a substrate(s) of the Therapeutic polypeptide corresponding to the Therapeutic portion 
of the albumin fusion protein of the invention can be routinely assayed using techniques
known in the art. 

In an alternative embodiment, where the ability of an albumin fusion protein of the 
invention to multimerize is being evaluated, association with other components of the 
multimer can be assayed, e.g., by means well-known in the art, such as, for example,
reducing and non-reducing gel chromatography, protein affinity chromatography, and affinity
blotting. See generally, Phizicky et al., supra. 

In addition, assays described herein (see Examples and Table 1) and otherwise known 
in the art may routinely be applied to measure the ability of albumin fusion proteins of the 
present invention and fragments, variants and derivatives thereof to elicit biological activity
and/or Therapeutic activity (either in vitro or in vivo) related to either the Therapeutic protein
portion and/or albumin portion of the albumin fusion protein of the present invention. Other 
methods will be known to the skilled artisan and are within the scope of the invention. 

Albumin 

As described above, an albumin fusion protein of the invention comprises at least a
fragment or variant of a Therapeutic protein and at least a fragment or variant of human serum
albumin, which are associated with one another, preferably by genetic fusion or chemical
conjugation.

The terms human serum albumin (HSA) and human albumin (HA) are used
interchangeably herein. The terms "albumin" and "serum albumin" are broader, and
encompass human serum albumin (and fragments and variants thereof) as well as albumin 
from other species (and fragments and variants thereof). 

As used herein, "albumin" refers collectively to albumin protein or amino acid 
sequence, or an albumin fragment or variant, having one or more functional activities (e.g., 
biological activities) of albumin. In particular, "albumin" refers to human albumin or
fragments thereof (see EP 201 239, EP 322 094, WO 97/24445, WO 95/23857), especially the
mature form of human albumin as shown in Figure 15 and SEQ ID NO: 18, or albumin from
other vertebrates or fragments thereof, or analogs or variants of these molecules or fragments
thereof. 

In preferred embodiments, the human serum albumin protein used in the albumin
fusion proteins of the invention contains one or both of the following sets of point mutations
with reference to SEQ ID NO: 18: Leu-407 to Ala, Leu-408 to Val, Val-409 to Ala, and
Arg-410 to Ala; or Arg-410 to Ala, Lys-413 to Gln, and Lys-414 to Gln (see, e.g., International
Publication No. WO95/23857, hereby incorporated in its entirety by reference herein). In
even more preferred embodiments, albumin fusion proteins of the invention that contain one
or both of the above-described sets of point mutations have improved stability/resistance to yeast
Yap3p proteolytic cleavage, allowing increased production of recombinant albumin fusion
proteins expressed in yeast host cells. 

As used herein, a portion of albumin sufficient to prolong the therapeutic activity or
shelf-life of the Therapeutic protein refers to a portion of albumin sufficient in length or
structure to stabilize or prolong the therapeutic activity of the protein so that the shelf-life of
the Therapeutic protein portion of the albumin fusion protein is prolonged or extended
compared to the shelf-life in the non-fusion state. The albumin portion of the albumin fusion
proteins may comprise the full length of the HA sequence as described above or as shown in
Figure 15, or may include one or more fragments thereof that are capable of stabilizing or
prolonging the therapeutic activity. Such fragments may be of 10 or more amino acids in
length or may include about 15, 20, 25, 30, 50, or more contiguous amino acids from the HA
sequence or may include part or all of specific domains of HA. For instance, one or more 
fragments of HA spanning the first two immunoglobulin-like domains may be used. 

The albumin portion of the albumin fusion proteins of the invention may be a variant 
of normal HA. The Therapeutic protein portion of the albumin fusion proteins of the
invention may also be variants of the Therapeutic proteins as described herein. The term
"variants" includes insertions, deletions and substitutions, either conservative or
non-conservative, where such changes do not substantially alter one or more of the oncotic, useful
ligand-binding and non-immunogenic properties of albumin, or the active site, or active 
domain which confers the therapeutic activities of the Therapeutic proteins. 

In particular, the albumin fusion proteins of the invention may include naturally
occurring polymorphic variants of human albumin and fragments of human albumin, for
example those fragments disclosed in EP 322 094 (namely HA (Pn), where n is 369 to 419).
The albumin may be derived from any vertebrate, especially any mammal, for example
human, cow, sheep, or pig. Non-mammalian albumins include, but are not limited to, hen
and salmon. The albumin portion of the albumin fusion protein may be from a different
animal than the Therapeutic protein portion. 

Generally speaking, an HA fragment or variant will be at least 100 amino acids long, 
preferably at least 150 amino acids long. The HA variant may consist of or alternatively
comprise at least one whole domain of HA, for example domains 1 (amino acids 1-194 of
SEQ ID NO:18), 2 (amino acids 195-387 of SEQ ID NO:18), 3 (amino acids 388-585 of SEQ
ID NO:18), 1 + 2 (1-387 of SEQ ID NO:18), 2 + 3 (195-585 of SEQ ID NO:18) or 1 + 3
(amino acids 1-194 of SEQ ID NO:18 + amino acids 388-585 of SEQ ID NO:18). Each
domain is itself made up of two homologous subdomains, namely 1-105, 120-194, 195-291,
316-387, 388-491 and 512-585, with flexible inter-subdomain linker regions comprising
residues Lys106 to Glu119, Glu292 to Val315 and Glu492 to Ala511.
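
As an illustration only, the domain and linker boundaries just listed can be captured in a small lookup table. The residue ranges are taken from the paragraph above; the function name `locate` and the returned labels are hypothetical and are not part of the specification.

```python
# Residue ranges (1-based, inclusive) for mature HA as stated in the text.
DOMAINS = {"1": (1, 194), "2": (195, 387), "3": (388, 585)}
LINKERS = [(106, 119), (292, 315), (492, 511)]  # inter-subdomain linkers


def locate(residue: int) -> tuple[str, str]:
    """Return (domain, region) for a residue number of mature HA.

    region is "linker" inside an inter-subdomain linker, else "subdomain".
    """
    if not 1 <= residue <= 585:
        raise ValueError("mature HA residues are numbered 1-585")
    domain = next(name for name, (lo, hi) in DOMAINS.items() if lo <= residue <= hi)
    in_linker = any(lo <= residue <= hi for lo, hi in LINKERS)
    return domain, "linker" if in_linker else "subdomain"


if __name__ == "__main__":
    print(locate(110))  # a residue inside the Lys106-Glu119 linker
```

A lookup of this kind is one way to check that a proposed fragment ends within a linker region when a subdomain-based fusion is designed.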

Preferably, the albumin portion of an albumin fusion protein of the invention 
comprises at least one subdomain or domain of HA or conservative modifications thereof. If 
the fusion is based on subdomains, some or all of the adjacent linker is preferably used to link 
to the Therapeutic protein moiety.

Albumin Fusion Proteins

The present invention relates generally to albumin fusion proteins and methods of 
treating, preventing, or ameliorating diseases or disorders. As used herein, "albumin fusion 
protein" refers to a protein formed by the fusion of at least one molecule of albumin (or a 
fragment or variant thereof) to at least one molecule of a Therapeutic protein (or fragment or 
variant thereof). An albumin fusion protein of the invention comprises at least a fragment or 
variant of a Therapeutic protein and at least a fragment or variant of human serum albumin, 
which are associated with one another, preferably by genetic fusion (i.e., the albumin fusion 
protein is generated by translation of a nucleic acid in which a polynucleotide encoding all or a 
portion of a Therapeutic protein is joined in-frame with a polynucleotide encoding all or a 
portion of albumin) or chemical conjugation to one another. The Therapeutic protein and 
albumin protein, once part of the albumin fusion protein, may be referred to as a "portion", 
"region" or "moiety" of the albumin fusion protein. 

In one embodiment, the invention provides an albumin fusion protein comprising, or 
alternatively consisting of, a Therapeutic protein (e.g., as described in Table 1) and a serum 
albumin protein. In other embodiments, the invention provides an albumin fusion protein
comprising, or alternatively consisting of, a biologically active and/or therapeutically active 
fragment of a Therapeutic protein and a serum albumin protein. In other embodiments, the 
invention provides an albumin fusion protein comprising, or alternatively consisting of, a 
biologically active and/or therapeutically active variant of a Therapeutic protein and a serum 
albumin protein. In preferred embodiments, the serum albumin protein component of the 
albumin fusion protein is the mature portion of serum albumin. 

In further embodiments, the invention provides an albumin fusion protein comprising, 
or alternatively consisting of, a Therapeutic protein, and a biologically active and/or 
therapeutically active fragment of serum albumin. In further embodiments, the invention 
provides an albumin fusion protein comprising, or alternatively consisting of, a Therapeutic 
protein and a biologically active and/or therapeutically active variant of serum albumin. In 
preferred embodiments, the Therapeutic protein portion of the albumin fusion protein is the
mature portion of the Therapeutic protein.

In further embodiments, the invention provides an albumin fusion protein comprising,
or alternatively consisting of, a biologically active and/or therapeutically active fragment or
variant of a Therapeutic protein and a biologically active and/or therapeutically active fragment
or variant of serum albumin. In preferred embodiments, the invention provides an albumin
fusion protein comprising, or alternatively consisting of, the mature portion of a Therapeutic
protein and the mature portion of serum albumin.

Preferably, the albumin fusion protein comprises HA as the N-terminal portion, and a 
Therapeutic protein as the C-terminal portion. Alternatively, an albumin fusion protein 
comprising HA as the C-terminal portion, and a Therapeutic protein as the N-terminal portion
may also be used.

In other embodiments, the albumin fusion protein has a Therapeutic protein fused to 
both the N-terminus and the C-terminus of albumin. In a preferred embodiment, the 
Therapeutic proteins fused at the N- and C-termini are the same Therapeutic proteins. In
another preferred embodiment, the Therapeutic proteins fused at the N- and C-termini are different
Therapeutic proteins. In another preferred embodiment, the Therapeutic proteins fused at the
N- and C-termini are different Therapeutic proteins which may be used to treat or prevent the
same disease, disorder, or condition (e.g., as listed in the "Preferred Indication Y" column of
Table 1). In another preferred embodiment, the Therapeutic proteins fused at the N- and
C-termini are different Therapeutic proteins which may be used to treat or prevent diseases or
disorders (e.g., as listed in the "Preferred Indication Y" column of Table 1) which are known
in the art to commonly occur in patients simultaneously. 

In addition to albumin fusion proteins in which the albumin portion is fused N-terminal
and/or C-terminal of the Therapeutic protein portion, albumin fusion proteins of the
invention may also be produced by inserting the Therapeutic protein or peptide of interest
(e.g., Therapeutic protein X as disclosed in Table 1) into an internal region of HA. For
instance, within the protein sequence of the HA molecule a number of loops or turns exist
between the end and beginning of α-helices, which are stabilized by disulphide bonds (see
Figures 9-11). The loops, as determined from the crystal structure of HA (Fig. 13) (PDB
identifiers 1AO6, 1BJ5, 1BKE, 1BM0, 1E7E to 1E7I and 1UOR) for the most part extend
away from the body of the molecule. These loops are useful for the insertion, or internal
fusion, of therapeutically active peptides, particularly those requiring a secondary structure to 
be functional, or Therapeutic proteins, to essentially generate an albumin molecule with 
specific biological activity. 

Loops in human albumin structure into which peptides or polypeptides may be
inserted to generate albumin fusion proteins of the invention include: Val54-Asn61,
Thr76-Asp89, Ala92-Glu100, Gln170-Ala176, His247-Glu252, Glu266-Glu277, Glu280-His288,
Ala362-Glu368, Lys439-Pro447, Val462-Lys475, Thr478-Pro486, and Lys560-Thr566. In
more preferred embodiments, peptides or polypeptides are inserted into the Val54-Asn61,
Gln170-Ala176, and/or Lys560-Thr566 loops of mature human albumin (SEQ ID NO: 18).
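
For illustration, the loop boundaries listed above can be held in a simple table and queried by residue number. The ranges are those stated in the text; the names `LOOPS`, `PREFERRED` and `loop_containing` are hypothetical helpers, not part of the specification.

```python
# Loop boundaries (1-based, inclusive) in mature HA, as listed in the text.
LOOPS = [
    (54, 61), (76, 89), (92, 100), (170, 176), (247, 252), (266, 277),
    (280, 288), (362, 368), (439, 447), (462, 475), (478, 486), (560, 566),
]
# Loops named in the text as more preferred insertion sites.
PREFERRED = {(54, 61), (170, 176), (560, 566)}


def loop_containing(residue: int):
    """Return the (start, end) loop that contains the residue, or None."""
    for start, end in LOOPS:
        if start <= residue <= end:
            return (start, end)
    return None


if __name__ == "__main__":
    loop = loop_containing(172)
    print(loop, loop in PREFERRED)
```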

Peptides to be inserted may be derived from either phage display or synthetic peptide 
libraries screened for specific biological activity or from the active portions of a molecule with 
the desired function. Additionally, random peptide libraries may be generated within
particular loops or by insertions of randomized peptides into particular loops of the HA 
molecule and in which all possible combinations of amino acids are represented. 

Such library(s) could be generated on HA or domain fragments of HA by one of the 
following methods: 

(a) randomized mutation of amino acids within one or more peptide loops of HA
or HA domain fragments. Either one, more or all the residues within a loop could be mutated
in this manner (for example see Fig. 10a);

(b) replacement of, or insertion into one or more loops of HA or HA domain
fragments (i.e., internal fusion) of a randomized peptide(s) of length Xn (where X is an amino
acid and n is the number of residues) (for example see Fig. 10b);

(c) N-, C- or N- and C-terminal peptide/protein fusions in addition to (a) and/or (b).

The HA or HA domain fragment may also be made multifunctional by grafting the 
peptides derived from different screens of different loops against different targets into the
same HA or HA domain fragment.

In preferred embodiments, peptides inserted into a loop of human serum albumin are 
peptide fragments or peptide variants of the Therapeutic proteins disclosed in Table 1. More
particularly, the invention encompasses albumin fusion proteins which comprise peptide
fragments or peptide variants at least 7, at least 8, at least 9, at least 10, at least 11, at least 12,
at least 13, at least 14, at least 15, at least 20, at least 25, at least 30, at least 35, or at least 40
amino acids in length inserted into a loop of human serum albumin. The invention also
encompasses albumin fusion proteins which comprise peptide fragments or peptide variants at
least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least
15, at least 20, at least 25, at least 30, at least 35, or at least 40 amino acids fused to the
N-terminus of human serum albumin. The invention also encompasses albumin fusion proteins
which comprise peptide fragments or peptide variants at least 7, at least 8, at least 9, at least
10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 20, at least 25, at least
30, at least 35, or at least 40 amino acids fused to the C-terminus of human serum albumin.

Generally, the albumin fusion proteins of the invention may have one HA-derived
region and one Therapeutic protein-derived region. Multiple regions of each protein,
however, may be used to make an albumin fusion protein of the invention. Similarly, more
than one Therapeutic protein may be used to make an albumin fusion protein of the invention.
For instance, a Therapeutic protein may be fused to both the N- and C-terminal ends of the
HA. In such a configuration, the Therapeutic protein portions may be the same or different
Therapeutic protein molecules. The structure of bifunctional albumin fusion proteins may be
represented as: X-HA-Y or Y-HA-X. 

For example, an anti-BLyS™ scFv-HA-IFNα-2b fusion may be prepared to modulate
the immune response to IFNα-2b by anti-BLyS™ scFv. An alternative is making a bi- (or
even multi-) functional dose of HA-fusions, e.g., HA-IFNα-2b fusion mixed with HA-anti-BLyS™
scFv fusion or other HA-fusions in various ratios depending on function, half-life,
etc.

Bi- or multi-functional albumin fusion proteins may also be prepared to target the
Therapeutic protein portion of a fusion to a target organ or cell type via protein or peptide at
the opposite terminus of HA. 

As an alternative to the fusion of known therapeutic molecules, the peptides could be 
obtained by screening libraries constructed as fusions to the N-, C- or N- and C-termini of
HA, or domain fragment of HA, of typically 6, 8, 12, 20 or 25 or Xn (where X is an amino
acid (aa) and n equals the number of residues) randomized amino acids, and in which all
possible combinations of amino acids were represented. A particular advantage of this
approach is that the peptides may be selected in situ on the HA molecule and the properties of
the peptide would therefore be as selected for rather than, potentially, modified as might be
the case for a peptide derived by any other method and then attached to HA.
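
To give a sense of scale for a library "in which all possible combinations of amino acids were represented", the number of distinct peptides of length n is 20 to the power n when the 20 standard amino acids are used. The sketch below is illustrative only; the function name and the default alphabet size of 20 are assumptions.

```python
def library_size(n: int, alphabet: int = 20) -> int:
    """Number of distinct peptides of length n over the given alphabet.

    Assumes each position is fully randomized and independent, and that
    the alphabet is the 20 standard amino acids unless stated otherwise.
    """
    if n < 1:
        raise ValueError("peptide length must be at least 1")
    return alphabet ** n


if __name__ == "__main__":
    # Lengths named in the text as typical for randomized peptides.
    for length in (6, 8, 12, 20, 25):
        print(length, f"{library_size(length):.2e}")
```

Under these assumptions the theoretical diversity grows very quickly with n, so for the longer lengths a physical library can only sample a fraction of the possible sequences.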

Additionally, the albumin fusion proteins of the invention may include a linker peptide 
between the fused portions to provide greater physical separation between the moieties and 
thus maximize the accessibility of the Therapeutic protein portion, for instance, for binding to 
its cognate receptor. The linker peptide may consist of amino acids such that it is flexible or
more rigid.

The linker sequence may be cleavable by a protease or chemically to yield the growth 
hormone related moiety. Preferably, the protease is one which is produced naturally by the 
host, for example the S. cerevisiae protease kex2 or equivalent proteases. 

Therefore, as described above, the albumin fusion proteins of the invention may have
the following formula R1-L-R2; R2-L-R1; or R1-L-R2-L-R1, wherein R1 is at least one
Therapeutic protein, peptide or polypeptide sequence, and not necessarily the same
Therapeutic protein, L is a linker and R2 is a serum albumin sequence.

In preferred embodiments, albumin fusion proteins of the invention comprising a
Therapeutic protein have extended shelf-life compared to the shelf-life of the same Therapeutic
protein when not fused to albumin. Shelf-life typically refers to the time period over which
the therapeutic activity of a Therapeutic protein in solution or in some other storage
formulation is stable without undue loss of therapeutic activity. Many of the Therapeutic
proteins are highly labile in their unfused state. As described below, the typical shelf-life of 
these Therapeutic proteins is markedly prolonged upon incorporation into the albumin fusion 
protein of the invention. 

Albumin fusion proteins of the invention with "prolonged" or "extended" shelf-life
exhibit greater therapeutic activity relative to a standard that has been subjected to the same
storage and handling conditions. The standard may be the unfused full-length Therapeutic
protein. When the Therapeutic protein portion of the albumin fusion protein is an analog, a
variant, or is otherwise altered or does not include the complete sequence for that protein, the
prolongation of therapeutic activity may alternatively be compared to the unfused equivalent of
that analog, variant, altered peptide or incomplete sequence. As an example, an albumin
fusion protein of the invention may retain greater than about 100% of the therapeutic activity,
or greater than about 105%, 110%, 120%, 130%, 150% or 200% of the therapeutic activity
of a standard when subjected to the same storage and handling conditions as the standard
when compared at a given time point.

Shelf-life may also be assessed in terms of therapeutic activity remaining after storage, 
normalized to therapeutic activity when storage began. Albumin fusion proteins of the 
invention with prolonged or extended shelf-life as exhibited by prolonged or extended 
therapeutic activity may retain greater than about 50% of the therapeutic activity, about 60%,
70%, 80%, or 90% or more of the therapeutic activity of the equivalent unfused Therapeutic
protein when subjected to the same conditions. For example, as discussed in Example 1, an 
albumin fusion protein of the invention comprising hGH fused to the full length HA sequence 
may retain about 80% or more of its original activity in solution for periods of up to 5 weeks 
or more under various temperature conditions. 
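
The two ways of expressing shelf-life described above (activity relative to a stored standard, and activity remaining relative to the start of storage) reduce to simple ratios. The following sketch is illustrative; the function names and the example numbers are hypothetical and are not data from the Examples.

```python
def percent_of_standard(fusion_activity: float, standard_activity: float) -> float:
    """Activity of the stored fusion as a percentage of the stored standard,
    both measured at the same time point after identical handling."""
    if standard_activity <= 0:
        raise ValueError("standard activity must be positive")
    return 100.0 * fusion_activity / standard_activity


def percent_retained(initial_activity: float, remaining_activity: float) -> float:
    """Activity remaining after storage, normalized to activity at the start."""
    if initial_activity <= 0:
        raise ValueError("initial activity must be positive")
    return 100.0 * remaining_activity / initial_activity


if __name__ == "__main__":
    # Hypothetical assay readings in arbitrary units.
    print(percent_of_standard(60.0, 50.0))
    print(percent_retained(200.0, 160.0))
```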

Expression of Fusion Proteins 

The albumin fusion proteins of the invention may be produced as recombinant 
molecules by secretion from yeast, a microorganism such as a bacterium, or a human or 
animal cell line. Preferably, the polypeptide is secreted from the host cells. We have found
that, by fusing the hGH coding sequence to the HA coding sequence, either to the 5' end or
3' end, it is possible to secrete the albumin fusion protein from yeast without the requirement
for a yeast-derived pro sequence. This was surprising, as other workers have found that a 
yeast derived pro sequence was needed for efficient secretion of hGH in yeast. 

For example, Hiramatsu et al. (Appl Environ Microbiol 56:2125 (1990); Appl Environ
Microbiol 57:2052 (1991)) found that the N-terminal portion of the pro sequence in the Mucor
pusillus rennin pre-pro leader was important. Other authors, using the MFα-1 signal, have
always included the MFα-1 pro sequence when secreting hGH. The pro sequences were
believed to assist in the folding of the hGH by acting as an intramolecular chaperone. The
present invention shows that HA or fragments of HA can perform a similar function. 

Hence, a particular embodiment of the invention comprises a DNA construct 
encoding a signal sequence effective for directing secretion in yeast, particularly a 
yeast-derived signal sequence (especially one which is homologous to the yeast host), and the
fused molecule of the first aspect of the invention, there being no yeast-derived pro sequence 
between the signal and the mature polypeptide. 

The Saccharomyces cerevisiae invertase signal is a preferred example of a 
yeast-derived signal sequence.

Conjugates of the kind prepared by Poznansky et al. (FEBS Lett. 239:18 (1988)), in
which separately-prepared polypeptides are joined by chemical cross-linking, are not
contemplated. 

The present invention also includes a cell, preferably a yeast cell transformed to 
express an albumin fusion protein of the invention. In addition to the transformed host cells 
themselves, the present invention also contemplates a culture of those cells, preferably a
monoclonal (clonally homogeneous) culture, or a culture derived from a monoclonal culture,
in a nutrient medium. If the polypeptide is secreted, the medium will contain the polypeptide,
with the cells, or without the cells if they have been filtered or centrifuged away. Many
expression systems are known and may be used, including bacteria (for example E. coli and
Bacillus subtilis), yeasts (for example Saccharomyces cerevisiae, Kluyveromyces lactis and
Pichia pastoris), filamentous fungi (for example Aspergillus), plant cells, animal cells and
insect cells. 

Preferred yeast strains to be used in the production of albumin fusion proteins are
D88, DXY1 and BXP10. D88 [leu2-3, leu2-122, can1, pra1, ubc4] is a derivative of parent
strain AH22his+ (also known as DB1; see, e.g., Sleep et al., Biotechnology 8:42-46 (1990)).
The strain contains a leu2 mutation which allows for auxotrophic selection of 2 micron-based
plasmids that contain the LEU2 gene. D88 also exhibits a derepression of PRB1 in glucose
excess. The PRB1 promoter is normally controlled by two checkpoints that monitor glucose
levels and growth stage. The promoter is activated in wild type yeast upon glucose depletion
and entry into stationary phase. Strain D88 exhibits the repression by glucose but maintains
the induction upon entry into stationary phase. The PRA1 gene encodes a yeast vacuolar
protease, YscA endoprotease A, that is localized in the ER. The UBC4 gene is in the
ubiquitination pathway and is involved in targeting short lived and abnormal proteins for
ubiquitin dependent degradation. Isolation of this ubc4 mutation was found to increase the
copy number of an expression plasmid in the cell and cause an increased level of expression
of a desired protein expressed from the plasmid (see, e.g., International Publication No.
WO99/00504, hereby incorporated in its entirety by reference herein).

DXY1, a derivative of D88, has the following genotype: [leu2-3, leu2-122, can1,
pra1, ubc4, ura3::yap3]. In addition to the mutations isolated in D88, this strain also has a
knockout of the YAP3 protease. This protease causes cleavage of mostly di-basic residues
(RR, RK, KR, KK) but can also promote cleavage at single basic residues in proteins.
Isolation of this yap3 mutation resulted in higher levels of full length HSA production (see,
e.g., U.S. Patent No. 5,965,386, and Kerry-Williams et al., Yeast 14:161-169 (1998),
hereby incorporated in their entireties by reference herein). 
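
As a rough illustration of the di-basic motifs mentioned above, a sequence can be scanned for RR, RK, KR and KK pairs. This is a sketch only: the function name is hypothetical, the scan reports motif positions rather than predicting actual cleavage, and single-basic sites are not considered.

```python
import re


def dibasic_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, motif) for every RR, RK, KR or KK pair.

    A lookahead is used so that overlapping pairs (e.g. in "KKK") are all
    reported. Presence of a motif does not by itself imply cleavage.
    """
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in re.finditer(r"(?=([KR][KR]))", seq)]


if __name__ == "__main__":
    # "SLDKR" is the five-residue mating factor alpha segment quoted in the text.
    print(dibasic_sites("SLDKR"))
```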

BXP10 has the following genotype: leu2-3, leu2-122, can1, pra1, ubc4, ura3,
yap3::URA3, lys2, hsp150::LYS2, pmt1::URA3. In addition to the mutations isolated in
DXY1, this strain also has a knockout of the PMT1 gene and the HSP150 gene. The PMT1
gene is a member of the evolutionarily conserved family of dolichyl-phosphate-D-mannose
protein O-mannosyltransferases (Pmts). The transmembrane topology of Pmt1p suggests
that it is an integral membrane protein of the endoplasmic reticulum with a role in O-linked
glycosylation. This mutation serves to reduce/eliminate O-linked glycosylation of HSA
fusions (see, e.g., International Publication No. WO00/44772, hereby incorporated in its
entirety by reference herein). Studies revealed that the Hsp150 protein is inefficiently
separated from rHA by ion exchange chromatography. The mutation in the HSP150 gene
removes a potential contaminant that has proven difficult to remove by standard purification
techniques. See, e.g., U.S. Patent No. 5,783,423, hereby incorporated in its entirety by
reference herein.

The desired protein is produced in conventional ways, for example from a coding 
sequence inserted in the host chromosome or on a free plasmid. The yeasts are transformed 
with a coding sequence for the desired protein in any of the usual ways, for example 
electroporation. Methods for transformation of yeast by electroporation are disclosed in
Becker & Guarente (1990) Methods Enzymol. 194, 182.

Successfully transformed cells, i.e., cells that contain a DNA construct of the present 
invention, can be identified by well known techniques. For example, cells resulting from the
introduction of an expression construct can be grown to produce the desired polypeptide.
Cells can be harvested and lysed and their DNA content examined for the presence of the
DNA using a method such as that described by Southern (1975) J. Mol. Biol. 98, 503 or
Berent et al. (1985) Biotech. 3, 208. Alternatively, the presence of the protein in the
supernatant can be detected using antibodies. 

Useful yeast plasmid vectors include pRS403-406 and pRS413-416 and are
generally available from Stratagene Cloning Systems, La Jolla, CA 92037, USA. Plasmids
pRS403, pRS404, pRS405 and pRS406 are Yeast Integrating plasmids (YIps) and
incorporate the yeast selectable markers HIS3, TRP1, LEU2 and URA3. Plasmids
pRS413-416 are Yeast Centromere plasmids (YCps).

Preferred vectors for making albumin fusion proteins for expression in yeast include
pPPC0005, pScCHSA, pScNHSA, and pC4:HSA which are described in detail in Example
2. Figure 4 shows a map of the pPPC0005 plasmid that can be used as the base vector into
which polynucleotides encoding Therapeutic proteins may be cloned to form HA-fusions. It
contains a PRB1 S. cerevisiae promoter (PRB1p), a Fusion leader sequence (FL), DNA
encoding HA (rHA) and an ADH1 S. cerevisiae terminator sequence. The sequence of the
fusion leader sequence consists of the first 19 amino acids of the signal peptide of human
serum albumin (SEQ ID NO: 29) and the last five amino acids of the mating factor alpha 1
promoter (SLDKR, see EP-A-387 319, which is hereby incorporated by reference in its
entirety).

The plasmids pPPC0005, pScCHSA, pScNHSA, and pC4:HSA were deposited on
April 11, 2001 at the American Type Culture Collection, 10801 University Boulevard, 

Manassas, Virginia 20110-2209 and given accession numbers ATCC , , 

and , respectively. Another vector useful for expressing an albumin fusion protein in 

15 yeast the pSAC35 vector which is described in Sleep et aL, BioTechnology 8:42 (1990) 
which is hereby incorporated by reference in its entirety. 

A variety of methods have been developed to operably link DNA to vectors via
complementary cohesive termini. For instance, complementary homopolymer tracts can be
added to the DNA segment to be inserted and to the vector DNA. The vector and DNA segment
are then joined by hydrogen bonding between the complementary homopolymeric tails to
form recombinant DNA molecules.
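
The pairing of complementary homopolymeric tails can be pictured with a minimal Python
sketch. It is an illustration only, not a protocol from this specification: the strand
sequences, the choice of poly(dC) and poly(dG) tracts, and the tail length of 15 are all
assumptions made for the example.

```python
# Illustrative sketch only: strands are plain 5'->3' strings; sequences and tail
# lengths are hypothetical and not taken from the specification.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a 5'->3' DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def add_3prime_tail(strand, base, length):
    """Append a homopolymer tract to the 3' end of a 5'->3' strand."""
    return strand + base * length


def tails_can_anneal(tail_a, tail_b):
    """True when two single-stranded tails can pair in antiparallel orientation."""
    return reverse_complement(tail_a) == tail_b


# Hypothetical insert strand with a poly(dC) tail and vector strand with a poly(dG) tail.
insert_strand = add_3prime_tail("ATGGCTAGC", "C", 15)
vector_strand = add_3prime_tail("GGTACCTTA", "G", 15)

insert_tail = insert_strand[-15:]
vector_tail = vector_strand[-15:]
print(tails_can_anneal(insert_tail, vector_tail))
```

Two tails of the same base (for example poly(dC) on both molecules) fail the check, which
is why the insert and the vector are given complementary tracts.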

Synthetic linkers containing one or more restriction sites provide an alternative method
of joining the DNA segment to vectors. The DNA segment, generated by endonuclease
restriction digestion, is treated with bacteriophage T4 DNA polymerase or E. coli DNA
polymerase I, enzymes that remove protruding, 3'-single-stranded termini with their
3'-5'-exonucleolytic activities, and fill in recessed 3'-ends with their polymerizing
activities. The combination of these activities therefore generates blunt-ended DNA segments.
The blunt-ended segments are then incubated with a large molar excess of linker molecules in
the presence of an enzyme that is able to catalyze the ligation of blunt-ended DNA molecules,
such as bacteriophage T4 DNA ligase. Thus, the products of the reaction are DNA segments
carrying polymeric linker sequences at their ends. These DNA segments are then cleaved with
the appropriate restriction enzyme and ligated to an expression vector that has been cleaved
with an enzyme that produces termini compatible with those of the DNA segment.
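
The blunt-ending and linker steps described above can be sketched in Python as a simple
string model. This is a simplified illustration under stated assumptions: every sequence
is written as it reads on the top strand, the fragment and the 10-mer linker are
hypothetical, only one linker is added per end, and the helper names are invented for the
example. GGATCC is the BamHI recognition sequence.

```python
# Simplified model: every sequence is written as it reads on the top strand, 5'->3'.
# The fragment, linker, and helper names are hypothetical illustrations.
def polish_end(overhang_type, overhang_seq):
    """Return the bases an end contributes to the duplex after polymerase treatment.

    "5prime": the recessed 3' end opposite the overhang is filled in, so the bases stay.
    "3prime": the protruding 3' tail is removed by 3'->5' exonuclease, so they are lost.
    "blunt":  nothing to do.
    """
    if overhang_type == "5prime":
        return overhang_seq
    if overhang_type in ("3prime", "blunt"):
        return ""
    raise ValueError(f"unknown overhang type: {overhang_type}")


def blunt_and_add_linkers(left, core, right, linker):
    """Blunt both ends of a fragment, then ligate one linker to each end."""
    blunt = polish_end(*left) + core + polish_end(*right)
    return linker + blunt + linker


def cut_sites(seq, site):
    """Return every start index of a recognition site in seq."""
    return [i for i in range(len(seq) - len(site) + 1) if seq[i:i + len(site)] == site]


BAMHI_SITE = "GGATCC"
linker = "CG" + BAMHI_SITE + "CG"  # hypothetical 10-mer linker carrying one BamHI site

# Hypothetical fragment: 5' overhang on the left end, 3' overhang on the right end.
product = blunt_and_add_linkers(
    ("5prime", "AATT"), "CATGGCTAGCTTGA", ("3prime", "TGCA"), linker
)
print(product)
print(cut_sites(product, BAMHI_SITE))
```

Cleaving the product at the two linker sites would then leave termini compatible with a
vector cut by the same enzyme, provided the inserted segment itself lacks that site.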

Synthetic linkers containing a variety of restriction endonuclease sites are
commercially available from a number of sources including International Biotechnologies Inc,
New Haven, CT, USA.

A desirable way to modify the DNA in accordance with the invention, if, for example,
HA variants are to be prepared, is to use the polymerase chain reaction as disclosed by Saiki
et al. (1988) Science 239, 487-491. In this method the DNA to be enzymatically amplified is
flanked by two specific oligonucleotide primers which themselves become incorporated into
the amplified DNA. The specific primers may contain restriction endonuclease recognition
sites which can be used for cloning into expression vectors using methods known in the art.
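
As an illustration of primers that carry restriction sites and become part of the
amplified DNA, the following Python sketch builds a primer pair and the expected product.
It rests on assumptions not found in this specification: the template is a made-up
sequence, the 18-base annealing region and the "GC" clamp are arbitrary choices, and
HindIII (AAGCTT) and BamHI (GGATCC) were picked only as familiar examples.

```python
# Illustrative sketch: the template, clamp, and annealing length are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

HINDIII = "AAGCTT"  # HindIII recognition sequence
BAMHI = "GGATCC"    # BamHI recognition sequence


def reverse_complement(seq):
    """Return the reverse complement of a 5'->3' DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def design_primers(template, anneal_len, fwd_site, rev_site, clamp="GC"):
    """Build each primer as clamp + restriction site + template-matching region."""
    forward = clamp + fwd_site + template[:anneal_len]
    reverse = clamp + rev_site + reverse_complement(template[-anneal_len:])
    return forward, reverse


def amplified_product(template, forward, reverse, anneal_len):
    """Top strand of the product; both primers end up incorporated in it."""
    interior = template[anneal_len:-anneal_len]
    return forward + interior + reverse_complement(reverse)


# Made-up 48-base template that contains neither recognition site.
template = "ATGGCTACCGTTGACCTGAAACGTCTGGAAGAACTGCTGAAATCCTAA"

forward, reverse = design_primers(template, 18, HINDIII, BAMHI)
product = amplified_product(template, forward, reverse, 18)
print(forward)
print(reverse)
print(product)
```

The product is the template flanked by the two sites, so digestion with the two enzymes
would release a fragment with defined ends for directional cloning.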

Exemplary genera of yeast contemplated to be useful in the practice of the present 
invention as hosts for expressing the albumin fusion proteins are Pichia (formerly classified 
as Hansenula), Saccharomyces, Kluyveromyces, Aspergillus, Candida, Torulopsis, 
Torulaspora, Schizosaccharomyces, Citeromyces, Pachysolen, Zygosaccharomyces, 
Debaryomyces, Trichoderma, Cephalosporium, Humicola, Mucor, Neurospora, Yarrowia,
Metschnikowia, Rhodosporidium, Leucosporidium, Botryoascus, Sporidiobolus,
Endomycopsis, and the like. Preferred genera are those selected from the group consisting of 
Saccharomyces, Schizosaccharomyces, Kluyveromyces, Pichia and Torulaspora. Examples 
of Saccharomyces spp. are S. cerevisiae, S. italicus and S. rouxii. 

Examples of Kluyveromyces spp. are K. fragilis, K. lactis and K. marxianus. A 
suitable Torulaspora species is T. delbrueckii. Examples of Pichia (Hansenula) spp. are P. 
angusta (formerly H. polymorpha), P. anomala (formerly H. anomala) and P. pastoris.
Methods for the transformation of S. cerevisiae are taught generally in EP 251 744, EP 258 
067 and WO 90/01063, all of which are incorporated herein by reference. 

Preferred exemplary species of Saccharomyces include S. cerevisiae, S. italicus, S.
diastaticus, and Zygosaccharomyces rouxii. Preferred exemplary species of Kluyveromyces
include K. fragilis and K. lactis. Preferred exemplary species of Hansenula include H.
polymorpha (now Pichia angusta), H. anomala (now Pichia anomala), and Pichia capsulata.
Additional preferred exemplary species of Pichia include P. pastoris. Preferred exemplary 
species of Aspergillus include A. niger and A. nidulans. Preferred exemplary species of 
Yarrowia include Y. lipolytica. Many preferred yeast species are available from the ATCC. 
For example, the following preferred yeast species are available from the ATCC and are 
useful in the expression of albumin fusion proteins: Saccharomyces cerevisiae Hansen, 
teleomorph strain BY4743 yap3 mutant (ATCC Accession No. 4022731); Saccharomyces 
cerevisiae Hansen, teleomorph strain BY4743 hsp150 mutant (ATCC Accession No.
4021266); Saccharomyces cerevisiae Hansen, teleomorph strain BY4743 pmt1 mutant
(ATCC Accession No. 4023792); Saccharomyces cerevisiae Hansen, teleomorph (ATCC 
Accession Nos. 20626; 44773; 44774; and 62995); Saccharomyces diastaticus Andrews et 
Gilliland ex van der Walt, teleomorph (ATCC Accession No. 62987); Kluyveromyces lactis 
(Dombrowski) van der Walt, teleomorph (ATCC Accession No. 76492); Pichia angusta 
(Teunisson et al.) Kurtzman, teleomorph deposited as Hansenula polymorpha de Morais et 
Maia, teleomorph (ATCC Accession No. 26012); Aspergillus niger van Tieghem, anamorph
(ATCC Accession No. 9029); Aspergillus niger van Tieghem, anamorph (ATCC Accession
No. 16404); Aspergillus nidulans (Eidam) Winter, anamorph (ATCC Accession No. 48756);
and Yarrowia lipolytica (Wickerham et al.) van der Walt et von Arx, teleomorph (ATCC
Accession No. 201847).

Suitable promoters for S. cerevisiae include those associated with the PGK1 gene,
GAL1 or GAL10 genes, CYC1, PHO5, TRP1, ADH1, ADH2, the genes for
glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase,
phosphofructokinase, triose phosphate isomerase, phosphoglucose isomerase, glucokinase,
alpha-mating factor pheromone, [a mating factor pheromone], the PRB1 promoter, the GUT2
promoter, the GPD1 promoter, and hybrid promoters involving hybrids of parts of 5'
regulatory regions with parts of 5' regulatory regions of other promoters or with upstream
activation sites (e.g. the promoter of EP-A-258 067).

Convenient regulatable promoters for use in Schizosaccharomyces pombe are the
thiamine-repressible promoter from the nmt gene as described by Maundrell (1990) J. Biol.
Chem. 265, 10857-10864 and the glucose repressible fbp1 gene promoter as described by
Hoffman & Winston (1990) Genetics 124, 807-816.

Methods of transforming Pichia for expression of foreign genes are taught in, for
example, Cregg et al. (1993), and various Phillips patents (e.g. US 4 857 467, incorporated
herein by reference), and Pichia expression kits are commercially available from Invitrogen
BV, Leek, Netherlands, and Invitrogen Corp., San Diego, California. Suitable promoters
include AOX1 and AOX2; Gleeson et al. (1986) J. Gen. Microbiol. 132, 3459-3465 includes
information on Hansenula vectors and transformation, suitable promoters being MOX1 and
FMD1; whilst EP 361 991, Fleer et al. (1991) and other publications from Rhone-Poulenc
Rorer teach how to express foreign proteins in Kluyveromyces spp., a suitable promoter
being PGK1.

The transcription termination signal is preferably the 3' flanking sequence of a
eukaryotic gene which contains proper signals for transcription termination and
polyadenylation. Suitable 3' flanking sequences may, for example, be those of the gene
naturally linked to the expression control sequence used, i.e. may correspond to the promoter.
Alternatively, they may be different in which case the termination signal of the S. cerevisiae
ADH1 gene is preferred.

The desired albumin fusion protein may be initially expressed with a secretion leader
sequence, which may be any leader effective in the yeast chosen. Leaders useful in S.
cerevisiae include that from the mating factor α polypeptide (MFα-1) and the hybrid leaders
of EP-A-387 319. Such leaders (or signals) are cleaved by the yeast before the mature
albumin is released into the surrounding medium. Further such leaders include those of S.
cerevisiae invertase (SUC2) disclosed in JP 62-096086 (granted as 911036516), acid
phosphatase (PHO5), the pre-sequence of MFα-1, β-glucanase (BGL2) and killer toxin; S.
diastaticus glucoamylase II; S. carlsbergensis α-galactosidase (MEL1); K. lactis killer toxin;
and Candida glucoamylase.

Additional Methods of Recombinant and Synthetic Production of Albumin
Fusion Proteins

The present invention also relates to vectors containing a polynucleotide encoding an
albumin fusion protein of the present invention, host cells, and the production of albumin
fusion proteins by synthetic and recombinant techniques. The vector may be, for example, a
phage, plasmid, viral, or retroviral vector. Retroviral vectors may be replication competent or
replication defective. In the latter case, viral propagation generally will occur only in
complementing host cells.

The polynucleotides encoding albumin fusion proteins of the invention may be joined
to a vector containing a selectable marker for propagation in a host. Generally, a plasmid
vector is introduced in a precipitate, such as a calcium phosphate precipitate, or in a complex
with a charged lipid. If the vector is a virus, it may be packaged in vitro using an appropriate
packaging cell line and then transduced into host cells.

The polynucleotide insert should be operatively linked to an appropriate promoter,
such as the phage lambda PL promoter, the E. coli lac, trp, phoA and tac promoters, the
SV40 early and late promoters and promoters of retroviral LTRs, to name a few. Other
suitable promoters will be known to the skilled artisan. The expression constructs will further
contain sites for transcription initiation, termination, and, in the transcribed region, a
ribosome binding site for translation. The coding portion of the transcripts expressed by the
constructs will preferably include a translation initiating codon at the beginning and a
termination codon (UAA, UGA or UAG) appropriately positioned at the end of the
polypeptide to be translated.
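
The requirement for an initiating codon at the beginning and a termination codon at the
end of the coding portion can be checked with a short Python sketch. It is a hedged
illustration only: the function name and test sequences are hypothetical, AUG is assumed
as the initiating codon, and one or more consecutive stop codons at the end (for example
a double stop such as TAATAA) are treated as a valid terminus.

```python
# Illustrative check of a coding region; function name and inputs are hypothetical.
STOP_CODONS = {"UAA", "UGA", "UAG"}
START_CODON = "AUG"  # assumed initiating codon


def to_rna(seq):
    """Write a DNA coding strand as the corresponding transcript."""
    return seq.upper().replace("T", "U")


def check_coding_region(seq):
    """Return a list of problems found in a coding sequence (empty if none)."""
    rna = to_rna(seq)
    problems = []
    if len(rna) % 3 != 0:
        problems.append("length is not a multiple of three")
    usable = len(rna) - len(rna) % 3
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]

    # Walk back over any run of stop codons at the 3' end (single or double stop).
    end = len(codons)
    while end > 0 and codons[end - 1] in STOP_CODONS:
        end -= 1

    if not codons or codons[0] != START_CODON:
        problems.append("no initiating AUG at the beginning")
    if end == len(codons):
        problems.append("no termination codon at the end")
    if any(codon in STOP_CODONS for codon in codons[1:end]):
        problems.append("premature in-frame termination codon")
    return problems


print(check_coding_region("ATGGCTTAATAA"))
print(check_coding_region("ATGTAAGCTTAA"))
```

An empty list means the sequence passed all four checks; each string in a non-empty list
names one defect that would need to be corrected before cloning.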

As indicated, the expression vectors will preferably include at least one selectable
marker. Such markers include dihydrofolate reductase, G418, glutamine synthase, or
neomycin resistance for eukaryotic cell culture, and tetracycline, kanamycin or ampicillin
resistance genes for culturing in E. coli and other bacteria. Representative examples of
appropriate hosts include, but are not limited to, bacterial cells, such as E. coli, Streptomyces
and Salmonella typhimurium cells; fungal cells, such as yeast cells (e.g., Saccharomyces
cerevisiae or Pichia pastoris (ATCC Accession No. 201178)); insect cells such as Drosophila
S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS, NS0, 293, and Bowes
melanoma cells; and plant cells. Appropriate culture mediums and conditions for the
above-described host cells are known in the art.

Vectors preferred for use in bacteria include pQE70, pQE60 and pQE-9,